Collection resources
Buonaccorsi, John
A surprising number of important problems can be cast in the framework of estimating a mean and variance using data arising from a two-stage structure. The first stage is a random sampling of "units" with some quantity of interest associated with the unit. The second stage produces an estimate of that quantity and usually, but not always, an estimated standard error, which may change considerably across units. Heteroscedasticity in the estimates over different units can arise for a number of reasons, including variation associated with the unit and changing sampling effort over units. This paper presents a broad discussion of...
Lencina, Viviana B.; Singer, Julio M.
We consider exact F tests for the hypothesis of null random factor effect in the presence of interaction under the two factor mixed models involved in the mixed models controversy. We show that under the constrained parameter (CP) model, even in unbalanced data situations, MSB/MSE (in the usual ANOVA notation) follows an exact F distribution when the null hypothesis holds. We also obtain an exact F test for what is generally (and erroneously) assumed to be an equivalent hypothesis under the unconstrained parameter (UP) model. For unbalanced data, such a corresponding test statistic does not coincide with MSB/MSAB (the test...
Heerschap, Nico; Willenborg, Leon
Changes in circumstances put pressure on Statistics Netherlands (SN) to redesign the way its statistics are produced. Key developments are: the changing needs of data-users, growing competition, pressure to reduce the survey burden on enterprises, emerging new technologies and methodologies and, first and foremost, the need for more efficiency because of budget cuts. This paper describes how SN, and especially its business statistics, can adapt to these new circumstances. We envisage an optimum situation as one with a single standardised production line for all statistics and a central data repository at its core. This single production line is supported...
Hugo, Graeme
International migration has reached unprecedented scale, diversity and political, economic, social and demographic significance in Asia over the last decade. Despite this, data collection on migrant stocks and flows remains very limited in most Asian countries. Accordingly, policy making on migration in the region lacks an evidence base and is influenced by interest groups, anecdotal evidence and prejudice. This paper argues that the heightened security consciousness since 9/11, together with the development of efficient computer-based systems for the collection and analysis of migration data, has created a propitious environment for bringing about a parametric improvement in data collection on international migration...
Frosini, Benito V.
This paper aims at displaying a synthetic view of the historical development and the current research concerning causal relationships, starting from the Aristotelian doctrine of causes, following with the main philosophical streams until the middle of the twentieth century, and commenting on the present intensive research work in the statistical domain. The philosophical survey dwells upon various concepts of cause, and some attempts towards picking out spurious causes. Concerning statistical modelling, factorial models and directed acyclic graphs are examined and compared. Special attention is devoted to randomization and pseudo-randomization (for observational studies) in view of avoiding the effect of possible...
Chen, Zhao-Guo; Ho Wu, Ka
For a target socio-economic variable, two sources of data with different precisions and collecting frequencies may be available. Typically, the less frequent data (e.g., annual report or census) are more reliable and are considered as benchmarks. The process of using them to adjust the more frequent and less reliable data (e.g., repeated monthly surveys) is called benchmarking. In this paper, we show the relationship among three types of benchmarking methods in the literature, namely the Denton (original and modified), the regression, and the signal-extraction methods. A new method called "quasi-linear regression" is proposed under the multiplicative assumption. The numerical...
Febres Cordero, Maria M.; Márquez, Bernardo
This article presents a statistical approach to assess the coherence of official results of referendum processes. The statistical analysis described is divided into four phases, according to the methodology used and the corresponding results: (1) Initial Study, (2) Quantification of irregular certificates of election, (3) Identification of irregular voting centers and (4) Estimation of recall referendum results. The technique of cluster analysis is applied to address the issue of heterogeneity of the parishes with respect to their political preferences. The Venezuelan recall referendum 2004 is the case study we used to apply the proposed methodology, based on the...
Willenborg, Leon; van den Hout, Ardo
A new method for protecting microdata is introduced. This method, perturbation of unsafe combinations (Peruco), eliminates unsafe combinations in the records in which they occur, by replacing them with safe combinations that are also consistent with the other values in the remainder of the records. Consistency is relative to a set of micro-edits that is supposed to be given along with the microdata. A deterministic optimization model is discussed, as well as possible extensions.
Clements, Kenneth W.; Izan, Izan H.Y.; Antony Selvanathan, E.
The stochastic approach is a new way of viewing index numbers in which uncertainty and statistical ideas play a central role. Rather than just providing a single number for the rate of inflation, the stochastic approach provides the whole probability distribution of inflation. This paper reviews the key elements of the approach and then discusses its early history, including some previously overlooked links with Fisher's work contained in his book The Making of Index Numbers. We then consider some more recent developments, including Diewert's well-known critique of the stochastic approach, and provide responses to his criticisms. We also provide a...
Debón, Ana; Montes, Francisco; Sala, Ramón
The nonparametric graduation of mortality data aims to estimate death rates by carrying out a smoothing of the crude rates obtained directly from original data. The main difference with regard to parametric models is that the assumption of an age-dependent function is unnecessary, which is advantageous when the information behind the model is unknown, as one cause of error is often the choice of an inappropriate model. This paper reviews the various alternatives and presents their application to mortality data from the Valencia Region, Spain. The comparison leads us to the conclusion that the best model is a smoothing by...
Reis, Edna A.; Salazar, Esther; Gamerman, Dani
Hyperparameter estimation in dynamic linear models leads to inference that is not available analytically. The most common recent approach is through MCMC approximations. A number of sampling schemes that have been proposed in the literature are compared. They differ mainly in their blocking structure. In this paper, the most common schemes are compared in terms of different efficiency criteria, including efficiency ratio and processing time. A sample of time series was simulated to reflect different relevant features such as series length and system volatility.
Gustafson, Karl
A matrix trigonometry developed chiefly by this author during the past 40 years has interesting applications to certain situations in statistics. The key conceptual entity in this matrix trigonometry is the matrix (maximal) turning angle. Associated entities (originally so-named by this author) are the matrix antieigenvalues and corresponding antieigenvectors upon which the matrix obtains its critical turning angles. Because this trigonometry is the natural one for linear operators and matrices, it also is the natural one for matrix statistics.
Radhakrishna Rao, C.
Books on linear models and multivariate analysis generally include a chapter on matrix algebra, quite rightly so, as matrix results are used in the discussion of statistical methods in these areas. During recent years a number of papers have appeared where statistical results derived without the use of matrix theorems have been used to prove some matrix results which are used to generate other statistical results. This may have some pedagogical value. It is not, however, suggested that prior knowledge of matrix theory is not necessary for studying statistics. It is intended to show that a judicious use of statistical...
Davies, Simon L.; Neath, Andrew A.; Cavanaugh, Joseph E.
Model selection criteria often arise by constructing unbiased or approximately unbiased estimators of measures known as expected overall discrepancies (Linhart & Zucchini, 1986, p. 19). Such measures quantify the disparity between the true model (i.e., the model which generated the observed data) and a fitted candidate model. For linear regression with normally distributed error terms, the "corrected" Akaike information criterion and the "modified" conceptual predictive statistic have been proposed as exactly unbiased estimators of their respective target discrepancies. We expand on previous work to additionally show that these criteria achieve minimum variance within the class of unbiased estimators.
Bertino, Salvatore
After defining the concept of representativeness of a random sample, the author proposes a measure of how much the observed sample represents its parent distribution. This measure is called Representativeness Index. The same measure, seen as a function of a sample and of a distribution, will be called Representativeness Function. For a given sample it provides the value of the index for the different distributions under examination, and for a given distribution it provides a measure of the representativeness of its possible samples. Such Representativeness Function can be used in an inferential framework just as the likelihood function, since it...
Estevao, Victor M.; Särndal, Carl-Erik
In the last decade, calibration estimation has developed into an important field of research in survey sampling. Calibration is now an important methodological instrument in the production of statistics. Several national statistical agencies have developed software designed to compute calibrated weights based on auxiliary information available in population registers and other sources. This paper reviews some recent progress and offers some new perspectives. Calibration estimation can be used to advantage in a range of different survey conditions. This paper examines several situations, including estimation for domains in one-phase sampling, estimation for two-phase sampling, and estimation for two-stage sampling with...
Tjetjep, Annelies; Seneta, Eugene
Financial returns (log-increments) data, $Y_t$, $t = 1, 2, \ldots$, are treated as a stationary process, with the common distribution at each time point being not necessarily symmetric. We consider as possible models for the common distribution four instances of the General Normal Variance-Mean Model (GNVM), which is described by $Y|V \sim N(a(b+V), c^2 V + d^2)$ where $V$ is a non-negative random variable and $a$, $b$, $c$ and $d$ are constants. When $V$ is Gamma distributed and $d=0$, $Y$ has the skewed Variance-Gamma distribution (VG). When $V$ follows a Half Normal distribution and $c=0$, $Y$ has the well-known Skew Normal (SN) distribution. We also consider two...
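The GNVM definition in the abstract above, Y | V ~ N(a(b+V), c^2 V + d^2), can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the authors' code: the function name sample_gnvm, the parameter values, and the Gamma shape and scale are all hypothetical choices made for illustration. It draws from the Variance-Gamma case (V Gamma distributed, d = 0) and compares the sample mean and variance with the values implied by the law of total expectation and total variance, E[Y] = a(b + E[V]) and Var(Y) = c^2 E[V] + d^2 + a^2 Var(V).

```python
import math
import random
import statistics


def sample_gnvm(n, a, b, c, d, draw_v, rng):
    """Draw n values of Y, where Y | V ~ N(a*(b + V), c^2 * V + d^2).

    draw_v is a callable taking the random generator and returning one
    non-negative draw of the mixing variable V.
    """
    ys = []
    for _ in range(n):
        v = draw_v(rng)                      # mixing variable V >= 0
        mean = a * (b + v)                   # conditional mean of Y given V
        sd = math.sqrt(c * c * v + d * d)    # conditional standard deviation
        ys.append(rng.gauss(mean, sd))
    return ys


# Hypothetical parameter values, chosen only for illustration.
rng = random.Random(0)
shape, scale = 2.0, 0.5            # V ~ Gamma(shape, scale): E[V] = 1, Var(V) = 0.5
a, b, c, d = 0.3, -1.0, 1.0, 0.0   # d = 0 gives the Variance-Gamma case

ys = sample_gnvm(100_000, a, b, c, d,
                 lambda r: r.gammavariate(shape, scale), rng)

sample_mean = statistics.fmean(ys)
sample_var = statistics.pvariance(ys)

# Moments implied by the mixture structure.
theory_mean = a * (b + shape * scale)
theory_var = c * c * shape * scale + d * d + a * a * shape * scale * scale

print(f"mean: sample={sample_mean:.4f} theory={theory_mean:.4f}")
print(f"var:  sample={sample_var:.4f} theory={theory_var:.4f}")
```

Passing a different draw_v (for example the absolute value of a standard normal draw, with c = 0 and d > 0) would give a draw from the Half Normal case mentioned in the abstract; the moment formulas in the check would then need the corresponding E[V] and Var(V).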
Shen, Jianfa
No consistent and reliable annual data series on the urbanization level for provincial regions of China is available. Making use of urban population data from the 1982 and 2000 population censuses, this paper estimates an annual data series of the urbanization level for provincial regions using an estimation approach developed on the basis of a conceptual model of dual-track urbanization. Based on such estimated new urban data of provincial regions, the major trends of urbanization in Chinese provinces and the relationship between urbanization and economic development are analysed for the period 1982-2000.
Sazak, Hakan S.; Tiku, Moti L.; Qamarul Islam, M.
In regression models, the design variable has primarily been treated as a nonstochastic variable. In numerous situations, however, the design variable is stochastic. The estimation and hypothesis testing problems in such situations are considered. Real life examples are given.
Horgan, Jane M.
When Dalenius provided a set of equations for the determination of stratum boundaries of a single auxiliary variable, that minimise the variance of the Horvitz-Thompson estimator of the mean or total under Neyman allocation for a fixed sample size, he pointed out that, though mathematically correct, those equations are troublesome to solve. Since then there has been a proliferation of approximations of an iterative nature, or otherwise cumbersome, tendered for this problem; many of these approximations assume a uniform distribution within strata, and, in the case of skewed populations, that all strata have the same relative variation. What seems to...