Comparison of evaluation metrics of deep learning for imbalanced imaging data in osteoarthritis studies
Author
Liu, S.
Roemer, F.
Ge, Y.
Bedrick, E.J.
Li, Z.-M.
Guermazi, A.
Sharma, L.
Eaton, C.
Hochberg, M.C.
Hunter, D.J.
Nevitt, M.C.
Wirth, W.
Kwoh, C. Kent
Sun, X.
Affiliation
Department of Epidemiology and Biostatistics, University of Arizona
Department of Management Information Systems, University of Arizona
University of Arizona Arthritis Center, University of Arizona
Issue Date
2023-09
Keywords
Bone marrow lesion
Deep learning
Imbalanced data
Osteoarthritis
Precision recall curve
Receiver operating characteristic
Publisher
W.B. Saunders Ltd
Citation
Liu, Shen, et al. "Comparison of evaluation metrics of deep learning for imbalanced imaging data in osteoarthritis studies." Osteoarthritis and Cartilage 31.9 (2023): 1242-1248.
Journal
Osteoarthritis and Cartilage
Rights
© 2023 The Author(s). Published by Elsevier Ltd on behalf of Osteoarthritis Research Society International. This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/).
Collection Information
This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at repository@u.library.arizona.edu.
Abstract
Purpose: To compare the evaluation metrics for deep learning methods that were developed using imbalanced imaging data in osteoarthritis studies.
Materials and methods: This retrospective study utilized 2996 sagittal intermediate-weighted fat-suppressed knee MRIs with MRI Osteoarthritis Knee Score readings from 2467 participants in the Osteoarthritis Initiative study. We obtained probabilities of the presence of bone marrow lesions (BMLs) from MRIs in the testing dataset at the sub-region (15 sub-regions), compartment, and whole-knee levels based on the trained deep learning models. We compared different evaluation metrics (e.g., receiver operating characteristic (ROC) and precision-recall (PR) curves) in the testing dataset with various class ratios (presence of BMLs vs. absence of BMLs) at these three data levels to assess the model's performance.
Results: In a subregion with an extremely high imbalance ratio, the model achieved a ROC-AUC of 0.84, a PR-AUC of 0.10, a sensitivity of 0, and a specificity of 1.
Conclusion: The commonly used ROC curve is not sufficiently informative, especially in the case of imbalanced data. We provide the following practical suggestions based on our data analysis: 1) ROC-AUC is recommended for balanced data, 2) PR-AUC should be used for moderately imbalanced data (i.e., when the proportion of the minor class is above 5% and less than 50%), and 3) for severely imbalanced data (i.e., when the proportion of the minor class is below 5%), it is not practical to apply a deep learning model, even with the application of techniques addressing imbalanced data issues. © 2023 The Author(s)
Note
Open access article
ISSN
1063-4584
PubMed ID
37209993
Version
Final Published Version
DOI
10.1016/j.joca.2023.05.006
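Illustration
The contrast the abstract draws between ROC-AUC and PR-AUC can be reproduced in spirit with a small simulation. The sketch below is not the authors' code and uses no data from the study. It assumes synthetic classifier scores drawn from two unit-variance Gaussians (mean 0 for negatives, mean 1.4 for positives), estimates PR-AUC by average precision, and ignores tied scores in that estimate. The function names `roc_auc`, `average_precision`, and `simulate` are hypothetical helpers written for this example.

```python
import random


def roc_auc(labels, scores):
    """ROC-AUC via the rank-sum (Mann-Whitney) identity; tied scores share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def average_precision(labels, scores):
    """Average precision: mean of the precision measured at each positive, highest score first."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    true_pos = 0
    total = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            true_pos += 1
            total += true_pos / rank
    return total / true_pos


def simulate(prevalence, n=20000, seed=0):
    """Synthetic labels and scores (assumption: Gaussian scores, positives shifted by 1.4)."""
    rng = random.Random(seed)
    labels = [1 if rng.random() < prevalence else 0 for _ in range(n)]
    scores = [rng.gauss(1.4 if y else 0.0, 1.0) for y in labels]
    return labels, scores


# Same score distributions, three class ratios: balanced, moderate, severe imbalance.
results = {}
for prevalence in (0.5, 0.2, 0.01):
    labels, scores = simulate(prevalence)
    results[prevalence] = (roc_auc(labels, scores), average_precision(labels, scores))
    print(f"prevalence={prevalence:.2f}  "
          f"ROC-AUC={results[prevalence][0]:.2f}  "
          f"PR-AUC={results[prevalence][1]:.2f}")
```

Because the score distributions are held fixed, ROC-AUC should stay roughly constant across the three class ratios, while average precision should fall as positives become rarer. That is the sense in which a ROC curve alone can look reassuring on severely imbalanced data.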
Related articles
- Development of a Magnetic Resonance Imaging-Based Definition of Knee Osteoarthritis: Data From the Multicenter Osteoarthritis Study.
  - Authors: Liew JW, Rabasa G, LaValley M, Collins J, Stefanik J, Roemer FW, Guermazi A, Lewis CE, Nevitt M, Torner J, Felson D
  - Issue date: 2023 Jul
- Knee tissue lesions and prediction of incident knee osteoarthritis over 7 years in a cohort of persons at higher risk.
  - Authors: Sharma L, Hochberg M, Nevitt M, Guermazi A, Roemer F, Crema MD, Eaton C, Jackson R, Kwoh K, Cauley J, Almagor O, Chmiel JS
  - Issue date: 2017 Jul
- Patellofemoral morphology measurements and their associations with tibiofemoral osteoarthritis-related structural damage: exploratory analysis on the osteoarthritis initiative.
  - Authors: Haj-Mirzaian A, Guermazi A, Pishgar F, Roemer FW, Sereni C, Hakky M, Zikria B, Demehri S
  - Issue date: 2020 Jan
- Synovitis mediates the association between bone marrow lesions and knee pain in osteoarthritis: data from the Foundation for the National Institute of Health (FNIH) Osteoarthritis Biomarkers Consortium.
  - Authors: Wang X, Chen T, Liang W, Fan T, Zhu Z, Cao P, Ruan G, Zhang Y, Chen S, Wang Q, Li S, Huang Y, Zeng M, Hunter DJ, Li J, Ding C
  - Issue date: 2022 Sep
- Comparison of BLOKS and WORMS scoring systems part I. Cross sectional comparison of methods to assess cartilage morphology, meniscal damage and bone marrow lesions on knee MRI: data from the osteoarthritis initiative.
  - Authors: Lynch JA, Roemer FW, Nevitt MC, Felson DT, Niu J, Eaton CB, Guermazi A
  - Issue date: 2010 Nov