dc.contributor.advisor: Subbian, Vignesh
dc.contributor.author: Pungitore, Sarah
dc.creator: Pungitore, Sarah
dc.date.accessioned: 2025-09-13T01:30:55Z
dc.date.available: 2025-09-13T01:30:55Z
dc.date.issued: 2025
dc.identifier.citation: Pungitore, Sarah. (2025). Next-Generation Computational Phenotyping with Large Language Models (Doctoral dissertation, University of Arizona, Tucson, USA).
dc.identifier.uri: http://hdl.handle.net/10150/678466
dc.description.abstract: In this dissertation, we presented a critical re-examination of computational phenotyping, a foundational activity in biomedical informatics that supports cohort discovery, observational research, and clinical quality improvement. Despite the development of numerous computable phenotypes across a wide range of clinical outcomes and conditions, the field continues to rely on labor-intensive methods involving manual review and algorithm design. In response, we introduced novel phenotyping methods using Large Language Models (LLMs) to reduce human burden and achieve synergy between human expertise and machine intelligence. These methodological enhancements enabled successful application of LLMs to phenotyping processes that previously required substantial human oversight. Our work lays the groundwork for the next generation of computational phenotyping methods, redefining how clinical knowledge is extracted and applied in the era of artificial intelligence. Each of the studies presented in this dissertation supported the progression of next-generation phenotyping methods by assessing the application of LLMs to computational phenotyping tasks. In the first study, we presented PHEONA (Evaluation of PHEnotyping for Observational Health Data), an evaluation framework designed specifically for LLMs. The components of this framework allowed us to thoroughly evaluate the suitability and feasibility of LLMs for various computational phenotyping tasks. In the second study, we developed a companion framework, SHREC (SHifting to language model-based REal-world Computational phenotyping), that outlined both an end-to-end phenotyping pipeline and the steps necessary to advance next-generation phenotyping methods. Using this framework, we assessed LLMs for concept classification and phenotyping of encounters, both of which are individual steps within the end-to-end pipeline. Finally, in the third study, to further evaluate performance deficiencies in applying LLMs to these tasks, we enhanced PHEONA to include an assessment of faulty reasoning within LLM responses.
dc.language.iso: en
dc.publisher: The University of Arizona.
dc.rights: Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Computational Phenotyping
dc.subject: Electronic Health Records
dc.subject: Generative Artificial Intelligence
dc.subject: Large Language Models
dc.title: Next-Generation Computational Phenotyping with Large Language Models
dc.type: text
dc.type: Electronic Dissertation
thesis.degree.grantor: University of Arizona
thesis.degree.level: doctoral
dc.contributor.committeemember: Secomb, Timothy
dc.contributor.committeemember: Bethard, Steven
dc.description.release: Release after 09/05/2027
thesis.degree.discipline: Graduate College
thesis.degree.discipline: Applied Mathematics
thesis.degree.name: Ph.D.


Files in this item

Name: azu_etd_22449_sip1_m.pdf
Embargo: 2027-09-05
Size: 5.545Mb
Format: PDF
