Collection resources
Project Euclid (Hosted at Cornell University Library) (202,748 resources)
Statistical Science
Zeitouni, Ofer
Sathamangalam Ranga Iyengar Srinivasa (Raghu) Varadhan was born in Chennai (then Madras). He received his Bachelor’s and Master’s degrees from Presidency College, Madras, and his PhD from the Indian Statistical Institute in Kolkata in 1963. That same year he came to the Courant Institute, New York University, as a postdoc and remained there as a faculty member throughout his career. He has received numerous prizes and recognitions, including the Abel Prize in 2007, the US National Medal of Science in 2010, and honorary degrees from the Chennai Mathematical Institute, Duke University, the Indian Statistical Institute, Kolkata, and the University of Paris.
Stigler, Stephen M.
Roughly half of Bayes’s famous essay was written by Richard Price, including the Appendix with all of the numerical examples. A study of this Appendix reveals Price (1) unusually for the time, felt it necessary to allow in his analysis for a hypothesis having been suggested by the same data used in its analysis, (2) was motivated (covertly in 1763, overtly in 1767) to undertake the study to refute David Hume on miracles, and (3) displayed a remarkable sense of collegiality in scientific controversy that should stand as a model for the present day. Price’s analysis of the posterior in...
Bai, Shuyang; Taqqu, Murad S.
Under long memory, the limit theorems for normalized sums of random variables typically involve a positive integer called “Hermite rank.” There is a different limit for each Hermite rank. From a statistical point of view, however, we argue that a rank other than one is unstable, whereas a rank equal to one is stable. We provide empirical evidence supporting this argument. This has important consequences. Assuming a higher-order rank when it is not really there usually results in underestimating the order of the fluctuations of the statistic of interest. We illustrate this through various examples involving the sample variance, the...
Gustafson, Paul; McCandless, Lawrence C.
Sensitivity analysis is used widely in statistical work. Yet the notion and properties of sensitivity parameters are often left quite vague and intuitive. Working in the Bayesian paradigm, we present a definition of when a sensitivity parameter is “pure,” and we discuss the implications of a parameter meeting or not meeting this definition. We also present a diagnostic with which the extent of violations of purity can be visualized.
Kendall, Michelle; Ayabina, Diepreye; Xu, Yuanwei; Stimson, James; Colijn, Caroline
Reconstructing who infected whom is a central challenge in analysing epidemiological data. Recently, advances in sequencing technology have led to increasing interest in Bayesian approaches to inferring who infected whom using genetic data from pathogens. The logic behind such approaches is that isolates that are nearly genetically identical are more likely to have been recently transmitted than those that are very different. A number of methods have been developed to perform this inference. However, testing their convergence, examining posterior sets of transmission trees and comparing methods’ performance are challenged by the fact that the object of inference—the transmission tree—is a...
Bretó, Carles
Likelihood-based statistical inference has been considered in most scientific fields involving stochastic modeling. This includes infectious disease dynamics, where scientific understanding can help capture biological processes in so-called mechanistic models and their likelihood functions. However, when the likelihood of such mechanistic models lacks a closed-form expression, computational burdens are substantial. In this context, algorithmic advances have facilitated likelihood maximization, promoting the study of novel data-motivated mechanistic models over the last decade. Reviewing these models is the focus of this paper. In particular, we highlight statistical aspects of these models like overdispersion, which is key in the interface between nonlinear infectious...
Kypraios, Theodore; O’Neill, Philip D.
The vast majority of models for the spread of communicable diseases are parametric in nature and involve underlying assumptions about how the disease spreads through a population. In this article, we consider the use of Bayesian nonparametric approaches to analysing data from disease outbreaks. Specifically we focus on methods for estimating the infection process in simple models under the assumption that this process has an explicit time-dependence.
Birrell, Paul J.; De Angelis, Daniela; Presanis, Anne M.
In recent years, the role of epidemic models in informing public health policies has progressively grown. Models have become increasingly realistic and more complex, requiring the use of multiple data sources to estimate all quantities of interest. This review summarises the different types of stochastic epidemic models that use evidence synthesis and highlights current challenges.
Gibson, Gavin J.; Streftaris, George; Thong, David
Model criticism is a growing focus of research in stochastic epidemic modelling, following the successful addressing of model fitting and parameter estimation via powerful computationally intensive statistical methods in recent decades. In this paper, we consider a variety of stochastic representations of epidemic outbreaks, with emphasis on individual-based continuous-time models, and review the range of model comparison and assessment approaches currently applied. We highlight some of the factors that can serve to impede checking and criticism of epidemic models such as lack of replication, partial observation of processes, lack of prior knowledge on parameters in competing models, the nonnested nature...
McKinley, Trevelyan J.; Vernon, Ian; Andrianakis, Ioannis; McCreesh, Nicky; Oakley, Jeremy E.; Nsubuga, Rebecca N.; Goldstein, Michael; White, Richard G.
Approximate Bayesian Computation (ABC) and other simulation-based inference methods are increasingly used for inference in complex systems, owing to their relative ease of implementation. We briefly review some of the more popular variants of ABC and their application in epidemiology, before using a real-world model of HIV transmission to illustrate some of the challenges when applying ABC methods to high-dimensional, computationally intensive models. We then discuss an alternative approach, history matching, that aims to address some of these issues, and conclude with a comparison between these different methodologies.
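As a rough illustration of the simplest ABC variant (rejection sampling), the sketch below infers a binomial success probability. It is not the HIV model from the paper: the toy binomial model, the uniform prior, the tolerance, and the function name `abc_rejection` are all assumptions chosen for brevity.

```python
import random

def abc_rejection(observed, n_trials, n_draws=5000, tolerance=2, seed=1):
    """Toy ABC rejection sampler (hypothetical example, not the paper's model)."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        # Draw a candidate parameter from the assumed Uniform(0, 1) prior.
        p = rng.random()
        # Simulate a data set from the model given the candidate parameter.
        simulated = sum(1 for _ in range(n_trials) if rng.random() < p)
        # Keep the candidate only if simulated data are close to the observation.
        if abs(simulated - observed) <= tolerance:
            accepted.append(p)
    return accepted

posterior_draws = abc_rejection(observed=30, n_trials=100)
posterior_mean = sum(posterior_draws) / len(posterior_draws)
```

The accepted draws approximate the posterior. Shrinking `tolerance` improves the approximation but lowers the acceptance rate, which is one reason plain rejection ABC scales poorly to high-dimensional models.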
Kypraios, Theodore; Minin, Vladimir N.
Nieto-Reyes, Alicia; Battey, Heather
Gijbels, Irène; Nagy, Stanislav
In this paper, we elaborate on the desirable properties of statistical depths for functional data. Although a formal definition has been put forward in the literature, several points remain unclear, and further insights are to be gained. Herein, a few interesting connections between the desired properties are found. In particular, it is demonstrated that the conditions needed for some desirable properties to hold are extremely demanding, and virtually impossible to meet for common depths. We establish adaptations of these properties which prove to be still sensible, and more easily met by common functional depths.
Aldous, David
In a simple model for sports, the probability A beats B is a specified function of their difference in strength. One might think this would be a staple topic in Applied Probability textbooks (like the Galton–Watson branching process model, for instance) but it is curiously absent. Our first purpose is to point out that the model suggests a wide range of questions, suitable for “undergraduate research” via simulation but also challenging as professional research. Our second, more specific, purpose concerns Elo-type rating algorithms for tracking changing strengths. There has been little foundational research on their accuracy, despite a much-copied “30...
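A minimal sketch of an Elo-type update may help make the model concrete. The logistic win-probability function, the scale of 400, and the step size `k=32` are the conventional chess choices and are assumptions here; the abstract only requires that the win probability be some specified function of the strength difference.

```python
def win_probability(rating_a, rating_b, scale=400.0):
    """Probability that A beats B as a function of the rating difference.

    Assumes the conventional logistic form; other link functions are possible.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))

def elo_update(rating_a, rating_b, a_won, k=32.0):
    """Return updated (rating_a, rating_b) after one match."""
    expected_a = win_probability(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    # The winner gains exactly what the loser gives up, so total rating is conserved.
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta

new_a, new_b = elo_update(1500.0, 1500.0, a_won=True)
```

With equal starting ratings the expected score is one half, so the winner moves up by `k / 2` and the loser moves down by the same amount.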
Ripamonti, Enrico; Lloyd, Chris; Quatto, Piero
The $2\times2$ table is the simplest of data structures yet it is of immense practical importance. It is also just complex enough to provide a theoretical testing ground for general frequentist methods. Yet after 70 years of debate, its correct analysis is still not settled. Rather than recount the entire history, our review is motivated by contemporary developments in likelihood and testing theory as well as computational advances. We will look at both conditional and unconditional tests. Within the conditional framework, we explain the relationship of Fisher’s test with variants such as mid-$p$ and Liebermeister’s test, as well as modern...
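To make the difference between Fisher's conditional test and its mid-$p$ variant concrete, here is a small sketch for a one-sided test on a $2\times2$ table with cells `a, b` (first row) and `c, d` (second row). The function names and the example table are illustrative assumptions, not taken from the paper.

```python
from math import comb

def hypergeom_pmf(x, row1, row2, col1):
    """P(top-left cell = x) given fixed margins (hypergeometric distribution)."""
    return comb(row1, x) * comb(row2, col1 - x) / comb(row1 + row2, col1)

def fisher_one_sided(a, b, c, d):
    """Return (Fisher p-value, mid-p value) for the alternative 'a is large'."""
    row1, row2, col1 = a + b, c + d, a + c
    largest = min(row1, col1)
    p_observed = hypergeom_pmf(a, row1, row2, col1)
    p_more_extreme = sum(
        hypergeom_pmf(x, row1, row2, col1) for x in range(a + 1, largest + 1)
    )
    # Fisher counts the observed table fully; mid-p counts only half of it.
    return p_observed + p_more_extreme, 0.5 * p_observed + p_more_extreme

fisher_p, mid_p = fisher_one_sided(3, 1, 1, 3)
```

For this table the Fisher p-value is 17/70 and the mid-$p$ value is 9/70, showing how the mid-$p$ adjustment reduces the conservatism caused by discreteness.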
Samartsidis, Pantelis; Montagna, Silvia; Johnson, Timothy D.; Nichols, Thomas E.
Neuroimaging meta-analysis is an area of growing interest in statistics. The special characteristics of neuroimaging data render classical meta-analysis methods inapplicable and therefore new methods have been developed. We review existing methodologies, explaining the benefits and drawbacks of each. A demonstration on a real dataset of emotion studies is included. We discuss some still-open problems in the field to highlight the need for future research.
Small, Dylan S.; Tan, Zhiqiang; Ramsahai, Roland R.; Lorch, Scott A.; Brookhart, M. Alan
The instrumental variables (IV) method provides a way to estimate the causal effect of a treatment when there are unmeasured confounding variables. The method requires a valid IV, a variable that is independent of the unmeasured confounding variables and is associated with the treatment but which has no effect on the outcome beyond its effect on the treatment. An additional assumption often made is deterministic monotonicity, which says that for each subject, the level of the treatment that a subject would take is a monotonic increasing function of the level of the IV. However, deterministic monotonicity is sometimes not realistic....
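The basic IV idea can be sketched with a simulation in which an unmeasured confounder biases the naive regression while the ratio estimator cov(Z, Y) / cov(Z, D) recovers the effect. The linear data-generating model, the coefficient values, and all names below are assumptions made for illustration; they are not the setting studied in the paper.

```python
import random

def covariance(xs, ys):
    """Population-style covariance of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

def simulate(n=20000, true_effect=2.0, seed=0):
    """Hypothetical linear model with an unmeasured confounder u."""
    rng = random.Random(seed)
    z, d, y = [], [], []
    for _ in range(n):
        u = rng.gauss(0, 1)                      # unmeasured confounder
        zi = 1.0 if rng.random() < 0.5 else 0.0  # instrument, independent of u
        di = zi + u + rng.gauss(0, 1)            # treatment depends on z and u
        yi = true_effect * di + 3.0 * u + rng.gauss(0, 1)  # z affects y only via d
        z.append(zi)
        d.append(di)
        y.append(yi)
    return z, d, y

z, d, y = simulate()
iv_estimate = covariance(z, y) / covariance(z, d)   # IV (ratio) estimator
ols_estimate = covariance(d, y) / covariance(d, d)  # naive slope, confounded by u
```

In this simulated setting the IV estimate lands near the true effect of 2, while the naive slope is pulled upward by the confounder.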
Yan, Xiaohan; Bien, Jacob
Demanding sparsity in estimated models has become a routine practice in statistics. In many situations, we wish to require that the sparsity patterns attained honor certain problem-specific constraints. Hierarchical sparse modeling (HSM) refers to situations in which these constraints specify that one set of parameters be set to zero whenever another is set to zero. In recent years, numerous papers have developed convex regularizers for this form of sparsity structure, which arises in many areas of statistics including interaction modeling, time series analysis, and covariance estimation. In this paper, we observe that these methods fall into two frameworks, the group...
Rosenbaum, Paul R.
The general structure of evidence factors is examined in terms of the knit product of two permutation groups. An observational or nonrandomized study of treatment effects has two evidence factors if it permits two (nearly) independent tests of the null hypothesis of no treatment effect and two (nearly) independent sensitivity analyses for those tests. Either of the two tests may be biased by nonrandom treatment assignment, but certain biases that would invalidate one test would have no impact on the other, so if the two tests concur, then some aspects of biased treatment assignment have been partially addressed. Expressed in...
Jeong, Jaehong; Jun, Mikyoung; Genton, Marc G.
Statistical models used in geophysical, environmental, and climate science applications must reflect the curvature of the spatial domain in global data. Over the past few decades, statisticians have developed covariance models that capture the spatial and temporal behavior of these global data sets. Though the geodesic distance is the most natural metric for measuring distance on the surface of a sphere, mathematical limitations have compelled statisticians to use the chordal distance to compute the covariance matrix in many applications instead, which may cause physically unrealistic distortions. Therefore, covariance functions directly defined on a sphere using the geodesic distance are needed....
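The two distances mentioned above are related by a simple formula: on a sphere of radius $R$, a geodesic (great-circle) distance $g$ corresponds to a chordal distance $2R\sin(g/(2R))$. The sketch below computes both from latitude and longitude in degrees; the function names and the use of the haversine formula are illustrative choices, not taken from the paper.

```python
import math

def geodesic_distance(lat1, lon1, lat2, lon2, radius=1.0):
    """Great-circle distance via the haversine formula (inputs in degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against rounding just above the valid asin domain.
    return 2 * radius * math.asin(min(1.0, math.sqrt(h)))

def chordal_distance(lat1, lon1, lat2, lon2, radius=1.0):
    """Straight-line distance through the sphere between two surface points."""
    central_angle = geodesic_distance(lat1, lon1, lat2, lon2, radius=1.0)
    return 2 * radius * math.sin(central_angle / 2)

pole_to_pole_geodesic = geodesic_distance(90.0, 0.0, -90.0, 0.0)
pole_to_pole_chordal = chordal_distance(90.0, 0.0, -90.0, 0.0)
```

For antipodal points on the unit sphere the geodesic distance is $\pi$ while the chordal distance is only 2, which shows how the chord understates separation at large distances.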