Collection resources
Project Euclid (Hosted at Cornell University Library) (192,977 resources)
Institute of Mathematical Statistics Collections
Morris L. (“Joe”) Eaton is one of the preeminent theoretical statisticians of the past 40 years, and his work has had a substantial impact on the way many statistical issues are currently viewed. His pioneering and fundamental research has spanned many areas with particular emphasis on multivariate statistics, decision theory, probability inequalities, invariance, and the foundations of Bayesian inference. Perhaps less well-known are his substantial contributions to many applied problems in clinical trials and other topics in biostatistics. This volume is dedicated to and in honor of him, from his collaborators, colleagues, friends and former students. Their contributions to this...
Kariya, Takeaki
In this paper we formulate a corporate bond (CB) pricing model for deriving the term structure of default probabilities (TSDP) and the recovery rate (RR) for each pair of industry factor and credit rating grade, and these derived TSDP and RR are regarded as what investors imply in forming CB prices in the market at each time. A unique feature of this formulation is that the model allows each firm to run several business lines corresponding to some industry categories, which is typical in reality. In fact, treating all the cross-sectional CB prices simultaneously under a credit correlation structure at...
Muirhead, Robb J.; Şoaita, Adina I.
This paper explores an approach to Bayesian sample size determination in clinical trials. The approach falls into the category of what is often called “proper Bayesian”, in that it does not mix frequentist concepts with Bayesian ones. A criterion for a “successful trial” is defined in terms of a posterior probability, its probability is assessed using the marginal distribution of the data, and this probability forms the basis for choosing sample sizes. We illustrate with a standard problem in clinical trials, that of establishing superiority of a new drug over a control.
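As a rough illustration of the idea in this abstract, the sketch below estimates the probability of a "successful trial" by simulating from the marginal (prior predictive) distribution of the data. Everything specific here is an assumption made for illustration and is not taken from the paper: a normal prior on the treatment difference, a known common standard deviation `sigma`, equal arm sizes, the same prior for design and analysis, and success defined as posterior probability of a positive difference exceeding `threshold`.

```python
import math
import random


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probability_of_success(n_per_arm, prior_mean, prior_sd, sigma,
                           threshold=0.95, draws=20000, seed=0):
    """Monte Carlo estimate of P(posterior P(theta > 0 | data) > threshold).

    Assumed (hypothetical) model: theta ~ N(prior_mean, prior_sd^2) and the
    observed difference in means d | theta ~ N(theta, 2 * sigma^2 / n_per_arm).
    """
    rng = random.Random(seed)
    sampling_var = 2.0 * sigma ** 2 / n_per_arm
    # Conjugate normal update: posterior variance does not depend on the data.
    post_var = 1.0 / (1.0 / prior_sd ** 2 + 1.0 / sampling_var)
    post_sd = math.sqrt(post_var)
    successes = 0
    for _ in range(draws):
        # Draw the data from its marginal distribution: theta first, then d.
        theta = rng.gauss(prior_mean, prior_sd)
        d = rng.gauss(theta, math.sqrt(sampling_var))
        post_mean = post_var * (prior_mean / prior_sd ** 2 + d / sampling_var)
        # Posterior probability that the new drug is superior (theta > 0).
        if normal_cdf(post_mean / post_sd) > threshold:
            successes += 1
    return successes / draws
```

Under these assumptions, one would evaluate `probability_of_success` over a grid of `n_per_arm` values and pick the smallest sample size whose success probability reaches a chosen target.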
Diaconis, Persi; Holmes, Susan; Shahshahani, Mehrdad
We develop algorithms for sampling from a probability distribution on a submanifold embedded in $\mathbb{R}^{n}$. Applications are given to the evaluation of algorithms in ‘Topological Statistics’; to goodness of fit tests in exponential families and to Neyman’s smooth test. This article is partially expository, giving an introduction to the tools of geometric measure theory.
Jiang, Yindeng; Perlman, Michael D.
For a bivariate random vector $(X,Y)$, symmetry conditions are presented that yield stochastic orderings among $|X|$, $|Y|$, $|\max(X,Y)|$, and $|\min(X,Y)|$. Partial extensions of these results for multivariate random vectors $(X_{1},\ldots ,X_{n})$ are also given.
Regazzini, Eugenio
Bruno de Finetti was one of the most convinced advocates of finitely additive probabilities. The present work describes the intellectual process that led him to support that stance and provides a detailed account both of the first paper by de Finetti on the subject and of the ensuing correspondence with Maurice Fréchet. Moreover, the analysis is supplemented by a useful picture of de Finetti’s interactions with the international scientific community at that time, when he elaborated his subjectivistic conception of probability.
Neath, Ronald C.
The Expectation-Maximization (EM) algorithm (Dempster, Laird and Rubin, 1977) is a popular method for computing maximum likelihood estimates (MLEs) in problems with missing data. Each iteration of the algorithm formally consists of an E-step: evaluate the expected complete-data log-likelihood given the observed data, with expectation taken at current parameter estimate; and an M-step: maximize the resulting expression to find the updated estimate. Conditions that guarantee convergence of the EM sequence to a unique MLE were found by Boyles (1983) and Wu (1983). In complicated models for high-dimensional data, it is common to encounter an intractable integral in the E-step. The...
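The E-step and M-step described in this abstract can be sketched on a toy problem where both steps are available in closed form. The model below (a two-component normal mixture with known unit variances) and the function names are assumptions chosen for illustration; the sketch does not address the intractable E-step case that the abstract goes on to mention.

```python
import math
import random


def normal_pdf(x, mean):
    """Density of N(mean, 1) at x."""
    z = x - mean
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def em_two_normals(data, mu0, mu1, weight=0.5, iterations=200):
    """EM for the mixture weight * N(mu1, 1) + (1 - weight) * N(mu0, 1).

    The component labels are the missing data; the unit variances are
    assumed known to keep the M-step to simple weighted averages.
    """
    n = len(data)
    for _ in range(iterations):
        # E-step: posterior probability that each point came from
        # component 1, evaluated at the current parameter estimates.
        resp = []
        for x in data:
            a = weight * normal_pdf(x, mu1)
            b = (1.0 - weight) * normal_pdf(x, mu0)
            resp.append(a / (a + b))
        # M-step: maximize the expected complete-data log-likelihood,
        # which here gives responsibility-weighted means and proportion.
        total = sum(resp)
        weight = total / n
        mu1 = sum(r * x for r, x in zip(resp, data)) / total
        mu0 = sum((1.0 - r) * x for r, x in zip(resp, data)) / (n - total)
    return mu0, mu1, weight
```

A fixed iteration count is used for simplicity; in practice one would more likely stop when the change in the observed-data log-likelihood falls below a tolerance.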
Tan, Aixin; Jones, Galin L.; Hobert, James P.
A Markov chain is geometrically ergodic if it converges to its invariant distribution at a geometric rate in total variation norm. We study geometric ergodicity of deterministic and random scan versions of the two-variable Gibbs sampler. We give a sufficient condition which simultaneously guarantees both versions are geometrically ergodic. We also develop a method for simultaneously establishing that both versions are subgeometrically ergodic. These general results allow us to characterize the convergence rate of two-variable Gibbs samplers in a particular family of discrete bivariate distributions.
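To make the two scan orders concrete, here is a minimal sketch of a two-variable Gibbs sampler in both versions. The target (a standard bivariate normal with correlation `rho`) is an assumption chosen because its full conditionals are simple; it is not the discrete family studied in the paper, and the sketch says nothing about convergence rates.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_steps, random_scan=False, seed=0):
    """Two-variable Gibbs sampler for a standard bivariate normal.

    Full conditionals: X | Y = y ~ N(rho * y, 1 - rho^2), and symmetrically
    for Y | X = x. Returns the list of visited (x, y) states.
    """
    rng = random.Random(seed)
    cond_sd = math.sqrt(1.0 - rho * rho)
    x, y = 0.0, 0.0
    samples = []
    for _ in range(n_steps):
        if random_scan:
            # Random scan: update one coordinate chosen with probability 1/2.
            if rng.random() < 0.5:
                x = rng.gauss(rho * y, cond_sd)
            else:
                y = rng.gauss(rho * x, cond_sd)
        else:
            # Deterministic scan: update x, then y, in a fixed order.
            x = rng.gauss(rho * y, cond_sd)
            y = rng.gauss(rho * x, cond_sd)
        samples.append((x, y))
    return samples


def sample_correlation(samples):
    """Pearson correlation of the x and y coordinates of the samples."""
    n = len(samples)
    mean_x = sum(s[0] for s in samples) / n
    mean_y = sum(s[1] for s in samples) / n
    sxy = sum((s[0] - mean_x) * (s[1] - mean_y) for s in samples)
    sxx = sum((s[0] - mean_x) ** 2 for s in samples)
    syy = sum((s[1] - mean_y) ** 2 for s in samples)
    return sxy / math.sqrt(sxx * syy)
```

Both versions leave the same target distribution invariant, so for a long run `sample_correlation` should be close to `rho` under either scan; they differ in how quickly the chain approaches that target.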
Geyer, Charles J.
If the log likelihood is approximately quadratic with constant Hessian, then the maximum likelihood estimator (MLE) is approximately normally distributed. No other assumptions are required. We do not need independent and identically distributed data. We do not need the law of large numbers (LLN) or the central limit theorem (CLT). We do not need sample size going to infinity or anything going to infinity.
Presented here is a combination of Le Cam style theory involving local asymptotic normality (LAN) and local asymptotic mixed normality (LAMN) and Cramér style theory involving derivatives and Fisher information. The main tool is convergence in law...
For more than thirty years, Jon A. Wellner has made outstanding contributions to several very active and important areas of statistics and probability. His results have been especially influential in semiparametric statistics, estimation and testing problems under shape constraints, empirical processes theory (both classical and abstract), survival analysis, biostatistics, bootstrap, probability in Banach spaces and high-dimensional probability. Among the main features of Jon’s work are his exceptional taste and ability to identify research problems in statistics that are both challenging and important, his deep understanding of the purely mathematical side of statistics, his extraordinary curiosity and interest in the work...
Walther, Guenther
Large-scale multiple testing problems require the simultaneous assessment of many p-values. This paper compares several methods to assess the evidence in multiple binomial counts of p-values: the maximum of the binomial counts after standardization (the “higher-criticism statistic”), the maximum of the binomial counts after a log-likelihood ratio transformation (the “Berk–Jones statistic”), and a newly introduced average of the binomial counts after a likelihood ratio transformation. Simulations show that the higher criticism statistic has a superior performance to the Berk–Jones statistic in the case of very sparse alternatives (sparsity coefficient $\beta \gtrapprox 0.75$), while the situation is reversed for $\beta \lessapprox...
van de Geer, Sara; Lederer, Johannes
We study high-dimensional linear models and the $\ell_1$-penalized least squares estimator, also known as the Lasso estimator. In the literature, oracle inequalities have been derived under restricted eigenvalue or compatibility conditions. In this paper, we complement this with entropy conditions which allow one to improve the dual norm bound, and demonstrate how this leads to new oracle inequalities. The new oracle inequalities show that a smaller choice for the tuning parameter and a trade-off between $\ell_1$-norms and small compatibility constants are possible. This implies, in particular for correlated design, improved bounds for the prediction error of the Lasso estimator as compared...
Ruymgaart, Frits; Wang, Jing; Wei, Shih-Hsuan
The general asymptotic distribution theory for the functional regression model in Ruymgaart et al. [Some asymptotic theory for functional regression and classification (2009) Texas Tech University] simplifies considerably if an extra assumption on the random regressor is made. In the special case where the regressor is a stochastic process on the unit interval, Johannes [Privileged communication (2008)] assumes the regressor to be stationary, in which case the eigenfunctions of their covariance operator turn out to be known, so that only the eigenvalues are to be estimated. In the present paper we will also assume the eigenvectors to be known, but...
Rosenbaum, Mathieu; Tsybakov, Alexandre B.
We consider the regression model with observation error in the design:
\begin{eqnarray*}y&=&X\theta^*+\xi,\\ Z&=&X+\Xi.\end{eqnarray*}
Here the random vector $y\in\mathbb{R}^n$ and the random $n\times p$ matrix $Z$ are observed, the $n\times p$ matrix $X$ is unknown, $\Xi$ is an $n\times p$ random noise matrix, $\xi\in\mathbb{R}^n$ is a random noise vector, and $\theta^*$ is a vector of unknown parameters to be estimated. We consider the setting where the dimension $p$ can be much larger than the sample size $n$ and $\theta^*$ is sparse. Because of the presence of the noise matrix $\Xi$, the commonly used Lasso and Dantzig selector are unstable. An alternative procedure called...
Pollard, David
Kagan and Shepp [The American Statistician 59 (2005) 54–56] presented an elegant example of a mixture model for which an insufficient statistic preserves Fisher information. This note uses the regularity property of differentiability in quadratic mean to provide another explanation for the phenomenon they observed. Some connections with Le Cam’s theory for convergence of experiments are noted.
Massart, Pascal; Rossignol, Raphaël
Nemirovski’s inequality states that given independent and centered at expectation random vectors $X_{1},\ldots,X_{n}$ with values in $\ell^p(\mathbb{R}^d)$, there exists some constant $C(p,d)$ such that
\[\mathbb{E}\Vert S_n\Vert _p^2\le C(p,d)\sum_{i=1}^{n}\mathbb{E}\Vert X_i\Vert _p^2.\]
Furthermore $C(p,d)$ can be taken as $\kappa(p\wedge \log(d))$. Two cases were studied further in [Am. Math. Mon. 117(2) (2010) 138–160]: general finite-dimensional Banach spaces and the special case $\ell^{\infty}(\mathbb{R}^{d})$. We show that in these two cases, it is possible to replace the quantity $\sum_{i=1}^n\mathbb{E}\Vert X_i\Vert _p^2$ by a smaller one without changing the order of magnitude of the constant when $d$ becomes large. In the spirit of [Am. Math. Mon. 117(2) (2010) 138–160], our...
Mason, David M.; Swanepoel, Jan W. H.
We use results from modern empirical process theory to establish a uniform in bandwidth central limit theorem, laws of the iterated logarithm and Glivenko–Cantelli theorem for kernel distribution function estimators.
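For readers unfamiliar with the object studied here, a kernel distribution function estimator smooths the empirical distribution function by replacing each jump with a rescaled kernel distribution function. The sketch below assumes a Gaussian kernel and a fixed bandwidth; both choices and the function names are illustrative and are not taken from the paper, whose results concern uniformity over a range of bandwidths.

```python
import math


def gaussian_cdf(z):
    """Standard normal distribution function, used as the integrated kernel."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kernel_cdf_estimate(data, x, bandwidth):
    """Kernel distribution function estimate at the point x.

    Computes (1/n) * sum_i K((x - X_i) / bandwidth), where K is the
    distribution function of the (assumed Gaussian) kernel.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    if not data:
        raise ValueError("data must be non-empty")
    return sum(gaussian_cdf((x - xi) / bandwidth) for xi in data) / len(data)
```

As the bandwidth shrinks, the estimate approaches the empirical distribution function away from the data points; larger bandwidths give a smoother curve.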
Kruijer, Willem; van der Vaart, Aad
We give bounds on the concentration of (pseudo) posterior distributions, both for correct and misspecified models. The bounds are derived using the information inequality, entropy estimates, and empirical process methods.
Koltchinskii, Vladimir
A problem of estimation of a large Hermitian nonnegatively definite matrix of trace 1 (a density matrix of a quantum system) motivated by quantum state tomography is studied. The estimator is based on a modified least squares method suitable in the case of models with random design with known design distributions. The bounds on Hilbert-Schmidt error of the estimator, including low rank oracle inequalities, have been proved. The proofs rely on Bernstein type inequalities for sums of independent random matrices.
Hall, W. J.; Wellner, Jon A.
We consider two variations on a Lehmann alternatives to symmetry-at-zero semiparametric model, with a real parameter $\theta$ quantifying skewness and a symmetric-at-0 distribution as a nuisance function. We show that a test of symmetry based on the signed log-rank statistic [A signed log-rank test of symmetry at zero (2011) University of Rochester] is asymptotically efficient in these models, derive its properties under local alternatives and present efficiency results relative to other signed-rank tests. We develop efficient estimation of the primary parameter in each model, using model-specific estimates of the nuisance function, and provide a method for choosing between the two...