Oxford Eprints
(1,101 resources)
Oxford E-prints is a cross-disciplinary digital archive for research articles written by Oxford University authors. The repository has been developed as part of the SHERPA (Securing a Hybrid Environment for Research Preservation and Access) project and is running on eprints.org open archives software.
Showing resources 1 - 20 of 138
1.
GOLD: Graphical Overview of Linkage Disequilibrium - Abecasis, G. R.; Cookson, W. O. C.
Summary: We describe a software package that provides a graphical summary of linkage disequilibrium in human genetic data. It allows for the analysis of family data and is well suited to the analysis of dense genetic maps. Availability: http://www.well.ox.ac.uk/asthma/GOLD Contact: goncalo@well.ox.ac.uk
2.
Estimating the rate of molecular evolution: incorporating non-contemporaneous sequences into maximum likelihood phylogenies - Rambaut, Andrew
Motivation: TipDate is a program that will use sequences that have been isolated at different dates to estimate their rate of molecular evolution. The program provides a maximum likelihood estimate of the rate and also the associated date of the most recent common ancestor of the sequences, under a model which assumes a constant rate of substitution (molecular clock) but which accommodates the dates of isolation. Confidence intervals for these parameters are also estimated. Results: The approach was applied to a sample of 17 dengue virus serotype 4 sequences, isolated at dates ranging from 1956 to 1994. The rate of substitution...
3.
TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing - Schmidt, Heiko A.; Strimmer, Korbinian; Vingron, Martin; von Haeseler, Arndt
Summary: TREE-PUZZLE is a program package for quartet-based maximum-likelihood phylogenetic analysis (formerly PUZZLE, Strimmer and von Haeseler, Mol. Biol. Evol., 13, 964-969, 1996) that provides methods for reconstruction, comparison, and testing of trees and models on DNA as well as protein sequences. To reduce waiting time for larger datasets the tree reconstruction part of the software has been parallelized using message passing that runs on clusters of workstations as well as parallel computers. Availability: http://www.tree-puzzle.de. The program is written in ANSI C. TREE-PUZZLE can be run on UNIX, Windows and Mac systems, including Mac OS X. To run the parallel...
4.
GENIE: estimating demographic history from molecular phylogenies - Pybus, O.G.; Rambaut, A.
Summary: GENIE implements a statistical framework for inferring the demographic history of a population from phylogenies that have been reconstructed from sampled DNA sequences. The methods are based on population genetic models known collectively as coalescent theory. Availability: GENIE is available from http://evolve.zoo.ox.ac.uk. All popular operating systems are supported. Contact: oliver.pybus@zoo.ox.ac.uk
5.
PCR designer for restriction analysis of various types of sequence mutation - Ke, Xiayi; Collins, Andrew; Ye, Shu
Summary: Restriction analysis is widely used to detect gene mutations such as insertions, deletions and single nucleotide polymorphisms (SNPs). Although such mutation sites sometimes present some natural restriction sites to differentiate the wild-type and mutant sequences, mismatches are often needed in order to create artificial restriction fragment length polymorphisms (RFLPs). In this report, a computer program is described that screens for suitable restriction enzymes, introducing mismatches where appropriate and when necessary, designs primers using the information of the selected restriction enzymes, their recognition sequence and locations as well as the information about the mismatches if any. The program, supported by...
7.
Gene finding with a hidden Markov model of genome structure and evolution - Pedersen, Jakob Skou; Hein, Jotun
Motivation: A growing number of genomes are sequenced. The differences in evolutionary pattern between functional regions can thus be observed genome-wide in a whole set of organisms. The diverse evolutionary pattern of different functional regions can be exploited in the process of genomic annotation. The modelling of evolution by the existing comparative gene finders leaves room for improvement. Results: A probabilistic model of both genome structure and evolution is designed. This type of model is called an Evolutionary Hidden Markov Model (EHMM), being composed of an HMM and a set of region-specific evolutionary models based on a phylogenetic tree. All parameters...
8.
Efficient selective screening of haplotype tag SNPs - Ke, Xiayi; Cardon, Lon R.
Abstract: Haplotypes defined by common single nucleotide polymorphisms (SNPs) have important implications for mapping of disease genes and human traits. Often only a small subset of the SNPs is sufficient to capture the full haplotype information. Such subsets of markers are called haplotype tagging SNPs (htSNPs). Although htSNPs can be identified by eye, efficient computer algorithms and flexible interactive software tools are required for large datasets such as the human genome haplotype map. We describe a Java-based program, SNPtagger, which screens for minimal sets of SNP markers to represent given haplotypes according to various user requirements. The program offers several...
9.
A RAPID algorithm for sequence database comparisons: application to the identification of vector contamination in the EMBL databases - Miller, C; Gurd, J; Brass, A
Motivation: Word-matching algorithms such as BLAST are routinely used for sequence comparison. These algorithms typically use areas of matching words to seed alignments which are then used to assess the degree of sequence similarity. In this paper, we show that by formally separating the word-matching and sequence-alignment process, and using information about word frequencies to generate alignments and similarity scores, we can create a new sequence-comparison algorithm which is both fast and sensitive. The formal split between word searching and alignment allows users to select an appropriate alignment method without affecting the underlying similarity search. The algorithm has been used...
10.
Using guide trees to construct multiple-sequence evolutionary HMMs - Holmes, I.
Motivation: Score-based progressive alignment algorithms do dynamic programming on successive branches of a guide tree. The analogous probabilistic construct is an Evolutionary HMM. This is a multiple-sequence hidden Markov model (HMM) made by combining transducers (conditionally normalised Pair HMMs) on the branches of a phylogenetic tree. Methods: We present general algorithms for constructing an Evolutionary HMM from any Pair HMM and for doing dynamic programming to any Multiple-sequence HMM. Results: Our prototype implementation, Handel, is based on the Thorne-Kishino-Felsenstein evolutionary model and is benchmarked using structural reference alignments. Availability: Handel can be downloaded under GPL from www.biowiki.org/Handel
11.
Statistical evaluation of the Predictive Toxicology Challenge 2000-2001 - Toivonen, Hannu; Srinivasan, Ashwin; King, Ross D.; Kramer, Stefan; Helma, Christoph
Motivation: The development of in silico models to predict chemical carcinogenesis from molecular structure would help greatly to prevent environmentally caused cancers. The Predictive Toxicology Challenge (PTC) competition was organized to test the state-of-the-art in applying machine learning to form such predictive models. Results: Fourteen machine learning groups generated 111 models. The use of Receiver Operating Characteristic (ROC) space allowed the models to be uniformly compared regardless of the error cost function. We developed a statistical method to test if a model performs significantly better than random in ROC space. Using this test as criteria five models performed better than random...
12.
Characterizing proteolytic cleavage site activity using bio-basis function neural networks - Thomson, Rebecca; Hodgman, T. Charles; Yang, Zheng Rong; Doyle, Austin K.
Motivation: In protein chemistry, proteomics and biopharmaceutical development, there is a desire to know not only where a protein is cleaved by a protease, but also the susceptibility of its cleavage sites. The current tools for proteolytic cleavage prediction have often relied purely on regular expressions, or involve models that do not represent biological data well. Results: A novel methodology for characterizing proteolytic cleavage site activities has been developed, which incorporates two fundamental features: activity class prediction and the use of an amino acid similarity matrix for (non-parametric) neural learning. The first solved the problem of predicting proteolytic efficiency. The second...
13.
Modeling within-motif dependence for transcription factor binding site predictions - Zhou, Qing; Liu, Jun S.
Motivation: The position-specific weight matrix (PWM) model, which assumes that each position in the DNA site contributes independently to the overall protein-DNA interaction, has been the primary means to describe transcription factor binding site motifs. Recent biological experiments, however, suggest that there exists interdependence among positions in the binding sites. In order to exploit this interdependence to aid motif discovery, we extend the PWM model to include pairs of correlated positions and design a Markov chain Monte Carlo algorithm to sample in the model space. We then combine the model sampling step with the Gibbs sampling framework for de novo...
14.
Generalized least squares for the synthesis of correlated information - Berrington, A.; Cox, D. R.
This paper deals with the synthesis of information from different studies when there is lack of independence in some of the contrasts to be combined. This problem can arise in several different situations in both case-control studies and clinical trials. For efficient estimation we appeal to the method of generalized least squares to estimate the summary effect and its standard error. This method requires estimates of the covariances between those contrasts that are not independent. Although it is not possible to estimate the covariance between effects that have been adjusted for confounding factors we present a method for finding upper...
15.
MCMC genome rearrangement - Miklós, István
Motivation: As more and more genomes have been sequenced, genomic data is rapidly accumulating. Genome-wide mutations are believed more neutral than local mutations such as substitutions, insertions and deletions, therefore phylogenetic investigations based on inversions, transpositions and inverted transpositions are less biased by the hypothesis on neutral evolution. Although efficient algorithms exist for obtaining the inversion distance of two signed permutations, there is no reliable algorithm when both inversions and transpositions are considered. Moreover, different types of mutations happen with different rates, and it is not clear how to weight them in a distance based approach. Results: We introduce a Markov...
16.
A statistical analysis of N- and O-glycan linkage conformations from crystallographic data - Petrescu, Andrei J.; Petrescu, Stefana M.; Dwek, Raymond A.; Wormald, Mark R.
We have generated a database of 639 glycosidic linkage structures by an exhaustive survey of the available crystallographic data for isolated oligosaccharides, glycoproteins, and glycan-binding proteins. For isolated oligosaccharides there is relatively little crystallographic data available. A much larger number of glycoprotein and glycan-binding protein structures have now been solved in which two or more linked monosaccharides can be resolved. In the majority of these cases, only a few residues can be seen. Using the 639 glycosidic linkage structures, we have identified one or more distinct conformers for all the linkages. The O5-C1-O-C(x)' torsion angles for all these distinct conformers...
17.
Oligosaccharide analysis and molecular modeling of soluble forms of glycoproteins belonging to the Ly-6, scavenger receptor, and immunoglobulin superfamilies expressed in Chinese hamster ovary cells - Rudd, Pauline M.; Wormald, Mark R.; Harvey, David J.; Devasahayam, Mercy; McAlister, Mark S.B.; Brown, Marion H.; Davis, Simon J.; Barclay, A. Neil; Dwek, Raymond A.
Most cell surface molecules are glycoproteins consisting of linear arrays of globular domains containing stretches of amino acid sequence with similarities to regions in other proteins. These conserved regions form the basis for the classification of proteins into superfamilies. Recombinant soluble forms of six leukocyte antigens belonging to the Ly-6 (CD59), scavenger receptor (CD5), and immunoglobulin (CD2, CD48, CD4, and Thy-1) superfamilies were expressed in the same Chinese hamster ovary cell line, thus providing an opportunity to examine the extent to which N-linked oligosaccharide processing might vary in a superfamily-, domain-, or protein-dependent manner in a given cell. While we...
18.
CFTR expression does not influence glycosylation of an epitope-tagged MUC1 mucin in colon carcinoma cell lines - Reid, Colm J.; Burdick, Michael D.; Hollingsworth, Michael A.; Harris, Ann
The cause of the mucus clearance problems associated with cystic fibrosis remains poorly understood though it has been suggested that mucin hypersecretion, dehydration of mucins, and biochemical abnormalities in the glycosylation of mucins may be responsible. Since the biochemical and biophysical properties of a mucin are dependent on O-glycosylation, our aim was to evaluate the O-glycosylation of a single mucin gene product in matched pairs of cells that differed with respect to CFTR expression. An epitope-tagged MUC1 mucin cDNA (MUC1F) was used to detect variation in mucin glycosylation in stably transfected colon carcinoma cell lines HT29 and Caco2. The glycosylation...
19.
The glycan processing and site occupancy of recombinant Thy-1 is markedly affected by the presence of a glycosylphosphatidylinositol anchor - Devasahayam, Mercy; Catalino, Peter D.; Rudd, Pauline M.; Dwek, Raymond A.; Barclay, A. Neil
Thy-1 is a cell surface glycoprotein containing three N-linked glycosylation sites and a glycosylphosphatidylinositol (GPI) anchor. The effect of the anchor on its N-linked glycosylation was investigated by comparing the glycosylation of soluble recombinant Thy-1 (sThy-1) with that of recombinant GPI anchored Thy-1, both expressed in Chinese hamster ovary cells. The sThy-1 was produced in a variety of isoforms including some which lacked carbohydrate on all three sequons whereas the GPI anchored form appeared fully glycosylated like native Thy-1. This was surprising as the attachment of N-linked sugars occurs cotranslationally and it was not expected that the presence of a...
20.
Protein structure controls the processing of the N-linked oligosaccharides and glycosylphosphatidylinositol glycans of variant surface glycoproteins expressed in bloodstream form Trypanosoma brucei - Zitzmann, Nicole; Mehlert, Angela; Carrou, Sandra; Rudd, Pauline M.; Ferguson, Michael A.J.
The variant surface glycoproteins (VSGs) of Trypanosoma brucei are a family of homodimeric glycoproteins that adopt similar shapes. An individual trypanosome expresses one VSG at a time in the form of a dense protective monolayer on the plasma membrane. VSG genes are expressed from one of several polycistronic transcription units (expression sites) that contain several expression site associated genes. We used a transformed trypanosome clone expressing two different VSGs (VSG121 and VSG221) from the same expression site (that of VSG221) to establish whether the genotype of the trypanosome clone or the VSG structure itself controls VSG N-linked oligosaccharide and GPI...