
dc.contributor.advisor: Hu, Chengcheng [en]
dc.contributor.author: Vasquez, Monica M. [en]
dc.creator: Vasquez, Monica M. [en]
dc.date.accessioned: 2017-09-26T20:42:01Z
dc.date.available: 2017-09-26T20:42:01Z
dc.date.issued: 2017
dc.identifier.uri: http://hdl.handle.net/10150/625637
dc.description.abstract: The study of circulating biomarkers and their association with disease outcomes has become progressively more complex due to advances in the measurement of these biomarkers through multiplex technologies. Although the availability of numerous serum biomarkers is highly promising, multiplex assays present statistical challenges due to the high dimensionality of the resulting data. In this dissertation, three studies are presented that address these challenges using L1-penalized regression methods. In the first part of the dissertation, an extensive simulation study is performed for the logistic regression model that compares the Least Absolute Shrinkage and Selection Operator (LASSO) method with five LASSO-type methods under scenarios that arise in serum biomarker research, such as high correlation between biomarkers, weak associations with the outcome, and a small number of true signals. Results show that the choice of optimal LASSO-type method depends on the data structure and should be guided by the research objective. The methods are then applied to the Tucson Epidemiological Study of Airway Obstructive Disease (TESAOD) to identify serum biomarkers of overweight and obesity. Measurements of serum biomarkers obtained with multiplex technologies may be more variable than those obtained with traditional single-biomarker methods. Measurement error may induce bias in parameter estimation and complicate the variable selection process. In the second part of the dissertation, an existing measurement error correction method for penalized linear regression with the L1 penalty is adapted to accommodate validation data on a randomly selected subset of the study sample. A simulation study and analysis of TESAOD data demonstrate that the proposed approach improves variable selection and reduces bias in parameter estimation with validation data on as little as 10 percent of the study sample.
In the third part of the dissertation, a measurement error correction method that utilizes validation data is proposed for the penalized logistic regression model with the L1 penalty. A simulation study and analysis of TESAOD data are used to evaluate the proposed method. Results show an improvement in variable selection.
dc.language.iso: en_US [en]
dc.publisher: The University of Arizona. [en]
dc.rights: Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author. [en]
dc.subject: Biomarkers [en]
dc.subject: High-Dimensional [en]
dc.subject: LASSO [en]
dc.subject: Measurement Error [en]
dc.subject: Obesity [en]
dc.subject: Overweight [en]
dc.title: Penalized Regression Methods in the Study of Serum Biomarkers for Overweight and Obesity [en_US]
dc.type: text [en]
dc.type: Electronic Dissertation [en]
thesis.degree.grantor: University of Arizona [en]
thesis.degree.level: doctoral [en]
dc.contributor.committeemember: Hu, Chengcheng [en]
dc.contributor.committeemember: Roe, Denise [en]
dc.contributor.committeemember: Billheimer, Dean [en]
dc.contributor.committeemember: Guerra, Stefano [en]
thesis.degree.discipline: Graduate College [en]
thesis.degree.discipline: Biostatistics [en]
thesis.degree.name: Ph.D. [en]
refterms.dateFOA: 2018-09-11T23:07:10Z


Files in this item

Name: azu_etd_15757_sip1_m.pdf
Size: 6.357Mb
Format: PDF

