Showing resources 1 - 20 of 943

  1. Assessing differences in legislators’ revealed preferences: A case study on the 107th U.S. Senate

    Lofland, Chelsea L.; Rodríguez, Abel; Moser, Scott
    Roll call data are widely used to assess legislators’ preferences and ideology, as well as test theories of legislative behavior. In particular, roll call data are often used to determine whether the revealed preferences of legislators are affected by outside forces such as party pressure, minority status or procedural rules. This paper describes a Bayesian hierarchical model that extends existing spatial voting models to test sharp hypotheses about differences in preferences using posterior probabilities associated with such hypotheses. We use our model to investigate the effect of the change of party majority status during the 107th U.S. Senate on the...

  2. Inference for social network models from egocentrically sampled data, with application to understanding persistent racial disparities in HIV prevalence in the US

    Krivitsky, Pavel N.; Morris, Martina
    Egocentric network sampling observes the network of interest from the point of view of a set of sampled actors, who provide information about themselves and anonymized information on their network neighbors. In survey research, this is often the most practical, and sometimes the only, way to observe certain classes of networks, with the sexual networks that underlie HIV transmission being the archetypal case. Although methods exist for recovering some descriptive network features, there is no rigorous and practical statistical foundation for estimation and inference for network models from such data. We identify a subclass of exponential-family random graph models (ERGMs)...

  3. Bayesian nonhomogeneous Markov models via Pólya-Gamma data augmentation with applications to rainfall modeling

    Holsclaw, Tracy; Greene, Arthur M.; Robertson, Andrew W.; Smyth, Padhraic
    Discrete-time hidden Markov models are a broadly useful class of latent-variable models with applications in areas such as speech recognition, bioinformatics, and climate data analysis. It is common in practice to introduce temporal nonhomogeneity into such models by making the transition probabilities dependent on time-varying exogenous input variables via a multinomial logistic parametrization. We extend such models to introduce additional nonhomogeneity into the emission distribution using a generalized linear model (GLM), with data augmentation for sampling-based inference. However, the presence of the logistic function in the state transition model significantly complicates parameter inference for the overall model, particularly in a...

  4. A multivariate mixed hidden Markov model for blue whale behaviour and responses to sound exposure

    DeRuiter, Stacy L.; Langrock, Roland; Skirbutas, Tomas; Goldbogen, Jeremy A.; Calambokidis, John; Friedlaender, Ari S.; Southall, Brandon L.
    Characterization of multivariate time series of behaviour data from animal-borne sensors is challenging. Biologists require methods to objectively quantify baseline behaviour, and then assess behaviour changes in response to environmental stimuli. Here, we apply hidden Markov models (HMMs) to characterize blue whale movement and diving behaviour, identifying latent states corresponding to three main underlying behaviour states: shallow feeding, travelling, and deep feeding. The model formulation accounts for inter-whale differences via a computationally efficient discrete random effect, and measures potential effects of experimental acoustic disturbance on between-state transition probabilities. We identify clear differences in blue whale disturbance response depending on the...

  5. Biomass prediction using a density-dependent diameter distribution model

    Schliep, Erin M.; Gelfand, Alan E.; Clark, James S.; Tomasek, Bradley J.
    Prediction of aboveground biomass, particularly at large spatial scales, is necessary for estimating global-scale carbon sequestration. Since biomass can be measured only by sacrificing trees, total biomass on plots is never observed. Rather, allometric equations are used to convert individual tree diameter to individual biomass, perhaps with noise. The values for all trees on a plot are then summed to obtain a derived total biomass for the plot. Then, with derived total biomasses for a collection of plots, regression models, using appropriate environmental covariates, are employed to attempt explanation and prediction. Not surprisingly, when out-of-sample validation is examined, such a...
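
The conventional "derived biomass" step that this abstract describes (convert each tree's diameter to biomass, then sum over the plot) can be sketched as below. This is only an illustration: the power-law form and the coefficients `a` and `b` are hypothetical placeholders, not values from the paper, and the optional noise term is omitted.

```python
def tree_biomass(diameter_cm, a=0.1, b=2.5):
    """Hypothetical allometric equation: biomass = a * diameter**b (no noise)."""
    return a * diameter_cm ** b


def plot_biomass(diameters_cm, a=0.1, b=2.5):
    """Derived total biomass for a plot: the sum of per-tree allometric biomass."""
    return sum(tree_biomass(d, a, b) for d in diameters_cm)


# Made-up diameters for the trees on one plot.
plot_diameters = [10.0, 20.0, 35.0]
total = plot_biomass(plot_diameters)
```

The plot total is a derived quantity rather than an observation, which is the limitation the abstract raises for regression models fitted to such totals.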

  6. Efficient estimation of age-specific social contact rates between men and women

    van de Kassteele, Jan; van Eijkeren, Jan; Wallinga, Jacco
    Social contact patterns reveal with whom individuals tend to socialize, and therefore to whom they transmit respiratory infections. We infer highly detailed age-specific contact rates between the sexes using a hierarchical Bayesian model that smooths while simultaneously guaranteeing the inherent reciprocity of contact rates. Application of this approach to social contact data from a large prospective survey confirms a tendency that people, especially children and adolescents, mostly contact other people of their own age and sex, and reveals that women have more contact with children than men do. These findings imply different exposure patterns between the two sexes for specific age...

  7. Functional time series models for ultrafine particle distributions

    Fischer, Heidi J.; Zhang, Qunfang; Zhu, Yifang; Weiss, Robert E.
    We propose Bayesian functional mixed effect time series models to explain the impact of engine idling on ultrafine particle (UFP) counts inside school buses. UFPs are toxic to humans and school bus engines emit particles primarily in the UFP size range. As school buses idle at bus stops, UFPs penetrate into cabins through cracks, doors, and windows. Counts increase over time at a size-dependent rate once the engine turns on. How UFP counts inside buses vary by particle size over time and under different idling conditions is not yet well understood. We model UFP counts at a given time using...

  8. Partially time-varying coefficient proportional hazards models with error-prone time-dependent covariates—an application to the AIDS Clinical Trial Group 175 data

    Song, Xiao; Wang, Li
    Due to cost and time considerations, interest has focused on identifying surrogate markers that could be substituted for the clinical endpoint, time to an event of interest, in evaluation of treatment efficacy. Joint models are often used to assess the effect of surrogate markers and treatment. Motivated by recent works studying the AIDS Clinical Trial Group (ACTG) 175 data, we propose a partially time-varying coefficient proportional hazards model for modeling the relationship between the hazard of failure and time-dependent and time-independent covariates. The time-varying coefficients are approximated by polynomial splines, and the corrected score and conditional score approaches are adopted...

  9. Electricity price dependence in New York State zones: A robust detrended correlation approach

    Dupuis, Debbie J.
    The cost of electricity varies across the zones of the New York State electric system. While fair and open access to the electrical grid is sought, we show that residents currently do not equally benefit, or suffer, from price changes. Upcoming major investments in the grid offer an opportunity to rectify these inequalities, but only if we understand the price-change propagation dynamics for the current underlying infrastructure. We study these dynamics, estimating the partial correlations between changes in electricity prices in connected zones. We develop and investigate a robust exponentially weighted correlation estimator that performs well in the presence of...
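
For orientation, a plain exponentially weighted correlation between two series of price changes might look like the sketch below. This is the ordinary, non-robust version, not the robust estimator the paper develops, and it does not compute partial correlations; the decay parameter `lam` and its default are assumptions made for illustration.

```python
import math


def ew_correlation(x, y, lam=0.94):
    """Exponentially weighted Pearson correlation (most recent observation last).

    Non-robust illustration; `lam` in (0, 1] is a hypothetical decay parameter.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    # Weight lam**k for an observation k steps in the past, normalized to sum to 1.
    raw = [lam ** (n - 1 - t) for t in range(n)]
    total = sum(raw)
    w = [r / total for r in raw]
    mx = sum(wi * xi for wi, xi in zip(w, x))
    my = sum(wi * yi for wi, yi in zip(w, y))
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(vx * vy)
```

With `lam=1.0` every observation gets equal weight and the function reduces to the usual sample correlation; smaller values emphasize recent price changes. A constant series gives zero variance and raises `ZeroDivisionError`.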

  10. Sensitivity analysis for an unobserved moderator in RCT-to-target-population generalization of treatment effects

    Nguyen, Trang Quynh; Ebnesajjad, Cyrus; Cole, Stephen R.; Stuart, Elizabeth A.
    In the presence of treatment effect heterogeneity, the average treatment effect (ATE) in a randomized controlled trial (RCT) may differ from the average effect of the same treatment if applied to a target population of interest. If all treatment effect moderators are observed in the RCT and in a dataset representing the target population, then we can obtain an estimate for the target population ATE by adjusting for the difference in the distribution of the moderators between the two samples. This paper considers sensitivity analyses for two situations: (1) where we cannot adjust for a specific moderator $V$ observed in...
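
The adjustment described for the case where all moderators are observed can be illustrated with a single discrete moderator: stratum-specific effects estimated in the RCT are averaged using the target population's moderator distribution instead of the trial's. The function name, strata, and numbers below are hypothetical and chosen only to show the arithmetic.

```python
def weighted_ate(stratum_effects, shares):
    """Average stratum-specific treatment effects using the given moderator shares."""
    if set(stratum_effects) != set(shares):
        raise ValueError("effects and shares must cover the same strata")
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(stratum_effects[v] * shares[v] for v in stratum_effects)


# Hypothetical stratum-specific effects estimated within the RCT.
effects = {"low": 1.0, "high": 3.0}
# Hypothetical moderator distributions in the trial and in the target population.
rct_shares = {"low": 0.5, "high": 0.5}
target_shares = {"low": 0.8, "high": 0.2}

rct_ate = weighted_ate(effects, rct_shares)
target_ate = weighted_ate(effects, target_shares)
```

Because the effect differs across strata and the target population has fewer "high" individuals than the trial, the two averages differ; with a homogeneous effect they would coincide.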

  11. Forecasting seasonal influenza with a state-space SIR model

    Osthus, Dave; Hickmann, Kyle S.; Caragea, Petruţa C.; Higdon, Dave; Del Valle, Sara Y.
    Seasonal influenza is a serious public health and societal problem due to its consequences resulting from absenteeism, hospitalizations, and deaths. The overall burden of influenza is captured by the Centers for Disease Control and Prevention’s influenza-like illness network, which provides invaluable information about the current incidence. This information is used to provide decision support regarding prevention and response efforts. Despite the relatively rich surveillance data and the recurrent nature of seasonal influenza, forecasting the timing and intensity of seasonal influenza in the U.S. remains challenging because the form of the disease transmission process is uncertain, the disease dynamics are only...

  12. A penalized Cox proportional hazards model with multiple time-varying exposures

    Wang, Chenkun; Liu, Hai; Gao, Sujuan
    In recent pharmacoepidemiology research, the increasing use of electronic medication dispensing data provides an unprecedented opportunity to examine various health outcomes associated with long-term medication usage. Often, patients may take multiple types of medications intended for the same medical condition and the medication exposure status and intensity may vary over time, posing challenges to the statistical modeling of such data. In this article, we propose a penalized Cox proportional hazards (PH) model with multiple functional covariates and potential interaction effects. We also consider constrained coefficient functions to ensure a diminishing medication effect over time. Hypothesis testing of interaction effect and...

  13. A statistical framework for data integration through graphical models with application to cancer genomics

    Zhang, Yuping; Ouyang, Zhengqing; Zhao, Hongyu
    Recent advances in high-throughput biotechnologies have generated various types of genetic, genomic, epigenetic, transcriptomic and proteomic data across different biological conditions. It is likely that integrating data from diverse experiments may lead to a more unified and global view of biological systems and complex diseases. We present a coherent statistical framework for integrating various types of data from distinct but related biological conditions through graphical models. Specifically, our statistical framework is designed for modeling multiple networks with shared regulatory mechanisms from heterogeneous high-dimensional datasets. The performance of our approach is illustrated through simulations and its applications to cancer genomics.

  14. Static and roving sensor data fusion for spatio-temporal hazard mapping with application to occupational exposure assessment

    Ludwig, Guilherme; Chu, Tingjin; Zhu, Jun; Wang, Haonan; Koehler, Kirsten
    Rapid technological advances have drastically improved the data collection capacity in occupational exposure assessment. However, advanced statistical methods for analyzing such data and drawing proper inference remain limited. The objectives of this paper are (1) to provide new spatio-temporal methodology that combines data from both roving and static sensors for data processing and hazard mapping across space and over time in an indoor environment, and (2) to compare the new method with the current industry practice, demonstrating the distinct advantages of the new method and the impact on occupational hazard assessment and future policy making in environmental health as well...

  15. A mixed-effects model for incomplete data from labeling-based quantitative proteomics experiments

    Chen, Lin S.; Wang, Jiebiao; Wang, Xianlong; Wang, Pei
    In mass spectrometry (MS) based quantitative proteomics research, the emerging iTRAQ (isobaric tag for relative and absolute quantitation) and TMT (tandem mass tags) techniques have been widely adopted for high throughput protein profiling. In a typical iTRAQ/TMT proteomics study, samples are grouped into batches, and each batch is processed by one multiplex experiment, in which the abundances of thousands of proteins/peptides in a batch of samples can be measured simultaneously. The multiplex labeling technique greatly enhances the throughput of protein quantification. However, the technical variation across different iTRAQ/TMT multiplex experiments is often large due to the dynamic nature of MS...

  16. Covariate-adaptive clustering of exposures for air pollution epidemiology cohorts

    Keller, Joshua P.; Drton, Mathias; Larson, Timothy; Kaufman, Joel D.; Sandler, Dale P.; Szpiro, Adam A.
    Cohort studies in air pollution epidemiology aim to establish associations between health outcomes and air pollution exposures. Statistical analysis of such associations is complicated by the multivariate nature of the pollutant exposure data as well as the spatial misalignment that arises from the fact that exposure data are collected at regulatory monitoring network locations distinct from cohort locations. We present a novel clustering approach for addressing this challenge. Specifically, we present a method that uses geographic covariate information to cluster multi-pollutant observations and predict cluster membership at cohort locations. Our predictive $k$-means procedure identifies centers using a mixture model and...

  17. Multivariate spatial mapping of soil water holding capacity with spatially varying cross-correlations

    Messick, Rachel M.; Heaton, Matthew J.; Hansen, Neil
    Irrigation in agriculture mitigates the adverse effects of drought and improves crop production and yield. Still, water scarcity remains a persistent issue and water resources need to be used responsibly. To improve water use efficiency, precision irrigation is emerging as an approach where farmers can vary the application of water according to within-field variation in soil and topographic conditions. As a precursor, methods to characterize spatial variation of soil hydraulic properties are needed. One such property is soil water holding capacity (WHC). This analysis develops a multivariate spatial model for predicting WHC across a field at various soil depths...

  18. Gene network reconstruction using global-local shrinkage priors

    Leday, Gwenaël G. R.; de Gunst, Mathisca C. M.; Kpogbezan, Gino B.; van der Vaart, Aad W.; van Wieringen, Wessel N.; van de Wiel, Mark A.
    Reconstructing a gene network from high-throughput molecular data is an important but challenging task, as the number of parameters to estimate is easily much larger than the sample size. A conventional remedy is to regularize or penalize the model likelihood. In network models, this is often done locally in the neighborhood of each node or gene. However, estimation of the many regularization parameters is often difficult and can result in large statistical uncertainties. In this paper we propose to combine local regularization with global shrinkage of the regularization parameters to borrow strength between genes and improve inference. We employ a...

  19. Modelling individual migration patterns using a Bayesian nonparametric approach for capture–recapture data

    Matechou, Eleni; Caron, François
    We present a Bayesian nonparametric approach for modelling wildlife migration patterns using capture–recapture (CR) data. Arrival times of individuals are modelled in continuous time and assumed to be drawn from a Poisson process with unknown intensity function, which is modelled via a flexible nonparametric mixture model. The proposed CR framework allows us to estimate the following: (i) the total number of individuals that arrived at the site, (ii) their times of arrival and departure, and hence their stopover duration, and (iii) the density of arrival times, providing a smooth representation of the arrival pattern of the individuals at the site....

  20. Randomization inference for stepped-wedge cluster-randomized trials: An application to community-based health insurance

    Ji, Xinyao; Fink, Gunther; Robyn, Paul Jacob; Small, Dylan S.
    National health insurance schemes are generally impractical in low-income countries due to limited resources and low organizational capacity. In response to such obstacles, community-based health insurance (CBHI) schemes have emerged over the past 20 years. CBHIs are designed to reduce the financial burden generated by unanticipated treatment cost among individuals falling sick, and thus are expected to make health care more affordable. In this paper, we investigate whether CBHI schemes effectively protect individuals against large financial shocks using a stepped-wedge cluster-randomized design on data from a CBHI program rolled out in rural Burkina Faso. We investigate statistical properties of the...
