dc.contributor.author: Ran, Di
dc.contributor.author: Daye, Z. John
dc.date.accessioned: 2017-09-14T22:56:27Z
dc.date.available: 2017-09-14T22:56:27Z
dc.date.issued: 2017-07-27
dc.identifier.citation: Gene expression variability and the analysis of large-scale RNA-seq studies with the MDSeq 2017, 45 (13):e127 Nucleic Acids Research [en]
dc.identifier.issn: 0305-1048
dc.identifier.issn: 1362-4962
dc.identifier.doi: 10.1093/nar/gkx456
dc.identifier.uri: http://hdl.handle.net/10150/625531
dc.description.abstract: Rapidly decreasing cost of next-generation sequencing has led to the recent availability of large-scale RNA-seq data, that empowers the analysis of gene expression variability, in addition to gene expression means. In this paper, we present the MDSeq, based on the coefficient of dispersion, to provide robust and computationally efficient analysis of both gene expression means and variability on RNA-seq counts. The MDSeq utilizes a novel reparametrization of the negative binomial to provide flexible generalized linear models (GLMs) on both the mean and dispersion. We address challenges of analyzing large-scale RNA-seq data via several new developments to provide a comprehensive toolset that models technical excess zeros, identifies outliers efficiently, and evaluates differential expressions at biologically interesting levels. We evaluated performances of the MDSeq using simulated data when the ground truths are known. Results suggest that the MDSeq often outperforms current methods for the analysis of gene expression mean and variability. Moreover, the MDSeq is applied in two real RNA-seq studies, in which we identified functionally relevant genes and gene pathways. Specifically, the analysis of gene expression variability with the MDSeq on the GTEx human brain tissue data has identified pathways associated with common neurodegenerative disorders when gene expression means were conserved.
dc.language.iso: en [en]
dc.publisher: OXFORD UNIV PRESS [en]
dc.relation.url: https://academic.oup.com/nar/article-lookup/doi/10.1093/nar/gkx456 [en]
dc.rights: © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License. [en]
dc.title: Gene expression variability and the analysis of large-scale RNA-seq studies with the MDSeq [en]
dc.type: Article [en]
dc.contributor.department: Univ Arizona, Mel & Enid Zuckerman Coll Publ Hlth [en]
dc.identifier.journal: Nucleic Acids Research [en]
dc.description.note: Open access journal [en]
dc.description.collectioninformation: This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at repository@u.library.arizona.edu. [en]
dc.eprint.version: Final published version [en]
refterms.dateFOA: 2018-06-18T08:51:10Z
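
The abstract refers to a reparametrization of the negative binomial in terms of a mean and a coefficient of dispersion. As a minimal sketch only, the Python below assumes the textbook variance-to-mean form, Var(Y) = phi * mu with phi > 1, and maps it onto the usual (r, p) parameters. The function names and the check by direct summation are illustrative assumptions. This is not the MDSeq implementation, and the exact form used in the paper may differ.

```python
import math


def nb_params_from_mean_dispersion(mu, phi):
    """Map (mean, variance-to-mean ratio) to the standard NB (r, p).

    Assumes the standard NB with mean r(1-p)/p and variance r(1-p)/p^2,
    so that variance / mean = 1/p. Requires overdispersion (phi > 1).
    """
    if mu <= 0 or phi <= 1:
        raise ValueError("need mu > 0 and phi > 1")
    p = 1.0 / phi
    r = mu / (phi - 1.0)
    return r, p


def nb_log_pmf(y, mu, phi):
    """Log-probability of count y under the (mu, phi) parametrization."""
    r, p = nb_params_from_mean_dispersion(mu, phi)
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(p) + y * math.log1p(-p))


def nb_moments(mu, phi, upper=5000):
    """Mean and variance by direct summation over counts 0..upper-1."""
    probs = [math.exp(nb_log_pmf(y, mu, phi)) for y in range(upper)]
    mean = sum(y * q for y, q in enumerate(probs))
    var = sum((y - mean) ** 2 * q for y, q in enumerate(probs))
    return mean, var


if __name__ == "__main__":
    mean, var = nb_moments(mu=10.0, phi=3.0)
    print(mean, var, var / mean)
```

Under this assumed form, two groups can share the same mean while differing in phi, which is the kind of variability difference the abstract says the method is designed to detect.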


Files in this item

Name: Ran_Gene_Expression_Variability.pdf
Size: 864.0Kb
Format: PDF
Description: Final Published Version
