The Interplay Between Diseases and Adaptation in the Human Genome
Publisher
The University of Arizona.
Rights
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract
Human health is largely influenced by genetic architecture and living environments. Evolutionary processes, especially past adaptation to changing environments, shaped this genetic architecture and may deeply influence current disease risks. Advances in genomic sequencing have dramatically improved our understanding of the genetic basis of diseases over the past ten years. Thousands of genes have been found to be associated with non-infectious and infectious diseases. However, the adaptation experienced by disease-associated genes is not well characterized, let alone the potential causal relationships between disease and genomic adaptation. Here, we use human genomic data to characterize the interplay between adaptation and human non-infectious diseases: which disease gene attributes may influence adaptation, and conversely how past adaptation may have shaped the landscape of disease variants. In the first chapter, I study an important prerequisite: accounting for confounders when studying adaptation in groups of genes (for example, disease genes) relative to the rest of the genome. I show how failing to account for confounding factors other than the biological categories of interest can cause spurious results in the framework of Gene Set Enrichment Analysis (GSEA) of past adaptation. I propose a pipeline that specifically addresses the methodological problems of GSEA applied to recent adaptation in the form of selective sweeps. In the second chapter, I use the GSEA approach established in the first chapter to study the relationship between human non-infectious disease and recent adaptation. I specifically try to clarify the dominant causal direction of this relationship. Adaptation might increase the risk of diseases. For example, deleterious mutations may increase in frequency by hitchhiking with advantageous mutations, and thus genes carrying deleterious variants may have experienced more recent adaptation than non-disease genes.
Alternatively, pre-existing disease status associated with disease genes might affect the occurrence of selective sweeps at disease genes through the specific attributes that differentiate disease genes from non-disease genes. We find a deficit of selective sweeps in Mendelian non-infectious disease genes compared to non-disease genes in the human genome. This deficit is due to linked disease variants substantially slowing down adaptation at disease gene loci. This highlights a dominant causal direction. It does not, however, exclude the possibility that selective sweeps have also increased the frequency of linked disease variants, albeit not at enough genes to create a visible selective sweep enrichment that would counteract the observed deficit caused by the predominant opposite effect of disease variants slowing down linked adaptive variants. Thus, the picture that emerges from these results is that, predominantly, some pre-existing specific attributes of disease genes have limited recent adaptation at their corresponding loci. Taking a step back to the definition of disease, a disease is a phenotype that deviates substantially from the optimum. What processes might increase the risk of such a large deviation? Past strong adaptations, including those that took place a long evolutionary time ago, may have taken the associated phenotypes further from the current optimum compared to the hypothetical situation where these adaptations had not occurred. For example, for a protein whose optimal abundance is high in the current and most historical environments, past adaptation to one particular environment that lowered the abundance to the edge of the disease-causing value may increase the risk of association with diseases. Any mutation that decreases the abundance slightly further may then push it below the critical disease level.
In this respect, past strong and rapid adaptation, as opposed to weak and slow adaptation, should have been particularly prone to cause pronounced shifts away from phenotypic optima. An important difficulty then is to first identify past strong adaptations in the human genome. This challenge presented an opportunity for me to connect my work on non-infectious diseases and adaptation to the work done by the rest of the lab on virus-driven adaptation. As mentioned, past strong adaptation should have been more prone to move phenotypes away from the current optima. We happen to know that viruses drove strong adaptation in human host genomes during ancient viral epidemics, in genes that interact physically with viruses (VIPs, for Virus-Interacting Proteins). This strong adaptation likely involved adaptive changes in gene expression and protein abundance, a phenotype that has been shown many times to be connected to genetic diseases. Although we do not have direct access to past changes in protein abundance, we can infer past changes in protein stability, a protein property that affects the abundance of folded, functional proteins. In the third chapter, I therefore study host protein adaptations in response to viruses that were driven by changes in the protein stability of VIPs. We find that past strong adaptation in VIPs mostly involved large stability changes. This result indicates that host VIP protein stability, and thus protein abundance, is a phenotype that was strongly selected during ancient viral epidemics. However, the optimal protein stability during past epidemics may deviate from the current optimum once the selective pressure has weakened or disappeared. In fact, we find compensatory evolution that maintains protein stability following viral epidemics in proviral VIPs, which have broadly conserved, non-immune, native host functions. At the same time, many genetic diseases are known to involve disease variants that decrease protein thermodynamic stability.
It is possible that strong past adaptation to viral infections that substantially changed protein stability in VIPs increases the risk that subsequent mutations are deleterious. However, further research is needed to connect these virus-driven adaptive changes in VIP stability to the present occurrence of non-infectious disease variants at VIPs. This connection represents a logical avenue for further research to continue characterizing the relationship between non-infectious disease genes and adaptation.
Type
Electronic Dissertation
text
Degree Name
Ph.D.
Degree Level
doctoral
Degree Program
Graduate College
Ecology & Evolutionary Biology