Publisher
The University of Arizona.
Rights
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract
We develop a new class of computationally feasible stochastic models for the statistical analysis of genetic sequence evolution and for inference of properties of the underlying substitution processes within a maximum likelihood framework. Existing models for the evolution of protein-coding sequences allow site-to-site variation in non-synonymous substitution rates, but assume that the rate of synonymous substitutions is constant across all sites. The new models provide a rigorous statistical framework for testing the hypothesis of synonymous rate constancy, and enable a host of data exploration and analysis tools. For several indicative data sets, the constancy assumption is shown to be violated, and some possible explanations are given. We also present an algorithm for improving the efficiency of maximum likelihood evaluations, and discuss HyPhy, a user-friendly and publicly distributed software implementation of our methods.
Type
text
Dissertation-Reproduction (electronic)
Degree Name
Ph.D.
Degree Level
doctoral
Degree Program
Graduate College
Applied Mathematics