Contributions to Bioinformatic Analytics for Gene Expression Data using Categorical Data Analysis to Inform on Gene Set Level Signals
Author
Aberasturi, Dillon
Issue Date
2022
Keywords
Categorical Data Analysis
design effect
enrichment
gene ontology
odds ratio
single subject study
Advisor
Piegorsch, Walter W.
Publisher
The University of Arizona.
Rights
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract
This document focuses on methods to combine single subject study results to analyze a single cohort of subjects or to compare two independent cohorts. The single subject studies used within are of a paired-sample design, in which two samples are taken from an individual, one for each of two differing conditions, without replicates. The proposed methods for using single subject studies to make cohort-level inferences all work at the gene set level; methods for transcript-level analyses are beyond the scope of this document. By summarizing the results of each single subject study within 2 × 2 contingency tables, the problem of how to combine single subject study results to analyze one or two cohorts is placed within the realm of categorical data analysis. As such, the solutions found within appeal to well-known constructs for analyzing categorical data, such as design effects and the approximate normality of natural log odds ratios. The first of the papers uses the approximate normality of natural log odds ratios to combine information across subjects within their respective cohorts and then contrasts the two cohort-level signals. The second paper alters the contingency tables used to summarize single subject study results to account for the direction of altered expression of the transcripts. Leveraging the additional information from the direction of altered expression improves the performance of inferences made to contrast two cohorts of subjects. The third paper combines single subject study results to identify enriched gene sets within a single cohort of subjects, while also properly accounting for inter-gene correlations, which have been shown to lead to inflated false positive rates. Taken together, the three papers presented within this document illustrate applications of categorical data analysis to the burgeoning field of transcriptomic single subject studies.
Type
text
Electronic Dissertation
Degree Name
Ph.D.
Degree Level
doctoral
Degree Program
Graduate College
Statistics
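
Illustrative sketch
The abstract describes summarizing each single subject study in a 2 × 2 contingency table and using the approximate normality of natural log odds ratios to combine subjects within a cohort and contrast two cohorts. The Python sketch below shows one generic way such a calculation could look; it is not the dissertation's method. The table layout (rows: transcript in the gene set or not; columns: altered expression or not), the 0.5 continuity correction, the inverse-variance pooling, and all function names are assumptions made for illustration. The sketch also ignores inter-gene correlation and design effects, which the abstract identifies as important.

```python
import math

def log_odds_ratio(a, b, c, d, correction=0.5):
    """Log odds ratio and its approximate variance for the table [[a, b], [c, d]].

    Assumed layout: a = in set and altered, b = in set and not altered,
    c = not in set and altered, d = not in set and not altered.
    A continuity correction is added to every cell so zero counts are usable.
    """
    a, b, c, d = (x + correction for x in (a, b, c, d))
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d  # large-sample variance of the log OR
    return log_or, variance

def pool_cohort(tables):
    """Inverse-variance weighted mean of per-subject log odds ratios."""
    estimates = [log_odds_ratio(*t) for t in tables]
    weights = [1 / var for _, var in estimates]
    total = sum(weights)
    pooled = sum(w * lor for w, (lor, _) in zip(weights, estimates)) / total
    return pooled, 1 / total  # pooled estimate and its variance

def contrast_cohorts(tables_a, tables_b):
    """Z statistic and two-sided normal p-value for the difference in pooled log ORs."""
    lor_a, var_a = pool_cohort(tables_a)
    lor_b, var_b = pool_cohort(tables_b)
    z = (lor_a - lor_b) / math.sqrt(var_a + var_b)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value

if __name__ == "__main__":
    # Hypothetical counts, one table (a, b, c, d) per subject.
    cohort_a = [(30, 20, 200, 750), (28, 22, 210, 740)]
    cohort_b = [(12, 38, 205, 745), (15, 35, 198, 752)]
    z, p = contrast_cohorts(cohort_a, cohort_b)
    print(f"z = {z:.3f}, two-sided p = {p:.4g}")
```

Treating subjects as independent and weighting by inverse variance is a standard fixed-effect choice; any adjustment for correlated genes within a set would have to modify the per-subject variances before pooling.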
