Large scale proteomic studies create novel privacy considerations
Name:
s41598-023-34866-6.pdf
Size:
2.333Mb
Format:
PDF
Description:
Final Published Version
Author
Hill, A.C.
Guo, C.
Litkowski, E.M.
Manichaikul, A.W.
Yu, B.
Konigsberg, I.R.
Gorbet, B.A.
Lange, L.A.
Pratte, K.A.
Kechris, K.J.
DeCamp, M.
Coors, M.
Ortega, V.E.
Rich, S.S.
Rotter, J.I.
Gerszten, R.E.
Clish, C.B.
Curtis, J.L.
Hu, X.
Obeidat, M.-E.
Morris, M.
Loureiro, J.
Ngo, D.
O’Neal, W.K.
Meyers, D.A.
Bleecker, E.R.
Hobbs, B.D.
Cho, M.H.
Banaei-Kashani, F.
Bowler, R.P.
Affiliation
University of Arizona
Issue Date
2023-06-07
Publisher
Nature Research
Citation
Hill, A.C., Guo, C., Litkowski, E.M. et al. Large scale proteomic studies create novel privacy considerations. Sci Rep 13, 9254 (2023). https://doi.org/10.1038/s41598-023-34866-6
Journal
Scientific Reports
Rights
© The Author(s) 2023. This article is licensed under a Creative Commons Attribution 4.0 International License.
Collection Information
This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at repository@u.library.arizona.edu.
Abstract
Privacy protection is a core principle of genomic but not proteomic research. We identified independent single nucleotide polymorphism (SNP) protein quantitative trait loci (pQTL) from COPDGene and Jackson Heart Study (JHS), calculated continuous protein level genotype probabilities, and then applied a naïve Bayesian approach to link SomaScan 1.3K proteomes to genomes for 2812 independent subjects from COPDGene, JHS, SubPopulations and InteRmediate Outcome Measures In COPD Study (SPIROMICS) and Multi-Ethnic Study of Atherosclerosis (MESA). We linked 90–95% of proteomes to their correct genome, and for 95–99% the correct genome was among the 1% most likely links. The linking accuracy in subjects with African ancestry was lower (~60%) unless training included diverse subjects. With larger profiling (SomaScan 5K) in the Atherosclerosis Risk in Communities (ARIC) study, correct identification was >99% even in mixed ancestry populations. We also linked proteomes to proteomes and used the proteome alone to determine features such as sex, ancestry, and first-degree relatives. When serial proteomes are available, the linking algorithm can be used to identify and correct mislabeled samples. This work also demonstrates the importance of including diverse populations in omics research and that large proteomic datasets (>1000 proteins) can be accurately linked to a specific genome through pQTL knowledge and should not be considered unidentifiable.
Note
Open access journal
ISSN
2045-2322
PubMed ID
37286633
Version
Final Published Version
DOI
10.1038/s41598-023-34866-6
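The naïve Bayesian linking step described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' implementation: it assumes each protein level is normally distributed around a genotype-specific mean, treats pQTLs as independent, and uses synthetic data. The names `link_proteomes` and `simulate_and_link`, the effect sizes, and the cohort size are all hypothetical.

```python
import numpy as np


def link_proteomes(proteins, genotypes, means, sds):
    """Score every proteome against every candidate genome.

    proteins  : (n_proteomes, n_pqtl) measured protein levels
    genotypes : (n_genomes, n_pqtl) allele counts coded 0, 1 or 2
    means     : (n_pqtl, 3) assumed mean protein level per genotype
    sds       : (n_pqtl,) assumed residual standard deviation per protein
    """
    n_pqtl = genotypes.shape[1]
    # expected[i, j] = mean level of protein j under genome i's genotype
    expected = means[np.arange(n_pqtl), genotypes]
    z = (proteins[:, None, :] - expected[None, :, :]) / sds
    # Naive Bayes assumption: pQTLs are independent, so log-likelihoods add.
    log_lik = (-0.5 * z ** 2 - np.log(sds)).sum(axis=2)
    # With a uniform prior over genomes, the best link maximises likelihood.
    return log_lik, log_lik.argmax(axis=1)


def simulate_and_link(n_subjects=100, n_pqtl=150, seed=0):
    """Simulate paired genomes and proteomes, then try to re-link them."""
    rng = np.random.default_rng(seed)
    genotypes = rng.integers(0, 3, size=(n_subjects, n_pqtl))
    # Hypothetical additive effects: mean level is 0, b or 2b per genotype.
    effect = rng.uniform(0.3, 1.0, size=n_pqtl)
    means = effect[:, None] * np.arange(3)
    sds = np.ones(n_pqtl)
    expected = means[np.arange(n_pqtl), genotypes]
    proteins = expected + rng.normal(0.0, sds, size=expected.shape)

    log_lik, best = link_proteomes(proteins, genotypes, means, sds)
    truth = np.arange(n_subjects)  # proteome i belongs to genome i
    top1_accuracy = float(np.mean(best == truth))
    # Rank of the true genome among all candidates (0 means best match).
    true_score = log_lik[truth, truth]
    rank = (log_lik > true_score[:, None]).sum(axis=1)
    return top1_accuracy, rank
```

Calling `simulate_and_link()` returns the fraction of simulated proteomes whose top-scoring genome is the true one, along with the rank of the true genome for each proteome. The rank corresponds to the abstract's measure of how often the correct genome is among the most likely links. Real data would need means and variances estimated from a training cohort, which is where the ancestry composition of that cohort matters.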
Related articles
- Protein prediction for trait mapping in diverse populations.
- Authors: Schubert R, Geoffroy E, Gregga I, Mulford AJ, Aguet F, Ardlie K, Gerszten R, Clish C, Van Den Berg D, Taylor KD, Durda P, Johnson WC, Cornell E, Guo X, Liu Y, Tracy R, Conomos M, Blackwell T, Papanicolaou G, Lappalainen T, Mikhaylova AV, Thornton TA, Cho MH, Gignoux CR, Lange L, Lange E, Rich SS, Rotter JI, NHLBI TOPMed Consortium, Manichaikul A, Im HK, Wheeler HE
- Issue date: 2022
- Genome-wide association study of homocysteine in African Americans from the Jackson Heart Study, the Multi-Ethnic Study of Atherosclerosis, and the Coronary Artery Risk in Young Adults study.
- Authors: Raffield LM, Ellis J, Olson NC, Duan Q, Li J, Durda P, Pankratz N, Keating BJ, Wassel CL, Cushman M, Wilson JG, Gross MD, Tracy RP, Rich SS, Reiner AP, Li Y, Willis MS, Lange EM, Lange LA
- Issue date: 2018 Mar
- Identifying novel genes for amyotrophic lateral sclerosis by integrating human brain proteomes with genome-wide association data.
- Authors: Gu XJ, Su WM, Dou M, Jiang Z, Duan QQ, Wang H, Ren YL, Cao B, Wang Y, Chen YP
- Issue date: 2023 Aug
- Comparison of Proteomic Assessment Methods in Multiple Cohort Studies.
- Authors: Raffield LM, Dang H, Pratte KA, Jacobson S, Gillenwater LA, Ampleford E, Barjaktarevic I, Basta P, Clish CB, Comellas AP, Cornell E, Curtis JL, Doerschuk C, Durda P, Emson C, Freeman CM, Guo X, Hastie AT, Hawkins GA, Herrera J, Johnson WC, Labaki WW, Liu Y, Masters B, Miller M, Ortega VE, Papanicolaou G, Peters S, Taylor KD, Rich SS, Rotter JI, Auer P, Reiner AP, Tracy RP, Ngo D, Gerszten RE, O'Neal WK, Bowler RP, NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium
- Issue date: 2020 Jun
- Genome-wide association study and meta-analysis identify loci associated with ventricular and supraventricular ectopy.
- Authors: Napier MD, Franceschini N, Gondalia R, Stewart JD, Méndez-Giráldez R, Sitlani CM, Seyerle AA, Highland HM, Li Y, Wilhelmsen KC, Yan S, Duan Q, Roach J, Yao J, Guo X, Taylor KD, Heckbert SR, Rotter JI, North KE, Reiner AP, Zhang ZM, Tinker LF, Liao D, Laurie CC, Gogarten SM, Lin HJ, Brody JA, Bartz TM, Psaty BM, Sotoodehnia N, Soliman EZ, Avery CL, Whitsel EA
- Issue date: 2018 Apr 4