Show simple item record

dc.contributor.author: Xu, D.
dc.contributor.author: Bethard, S.
dc.date.accessioned: 2022-03-17T01:56:58Z
dc.date.available: 2022-03-17T01:56:58Z
dc.date.issued: 2021
dc.identifier.citation: Xu, D., & Bethard, S. (2021, June). Triplet-Trained Vector Space and Sieve-Based Search Improve Biomedical Concept Normalization. In Proceedings of the 20th Workshop on Biomedical Language Processing (pp. 11-22).
dc.identifier.isbn: 9781954085404
dc.identifier.doi: 10.18653/v1/2021.bionlp-1.2
dc.identifier.uri: http://hdl.handle.net/10150/663578
dc.description.abstract: Concept normalization, the task of linking textual mentions of concepts to concepts in an ontology, is critical for mining and analyzing biomedical texts. We propose a vector-space model for concept normalization, where mentions and concepts are encoded via transformer networks that are trained via a triplet objective with online hard triplet mining. The transformer networks refine existing pre-trained models, and the online triplet mining makes training efficient even with hundreds of thousands of concepts by sampling training triples within each mini-batch. We introduce a variety of strategies for searching with the trained vector-space model, including approaches that incorporate domain-specific synonyms at search time with no model retraining. Across five datasets, our models that are trained only once on their corresponding ontologies are within 3 points of state-of-the-art models that are retrained for each new domain. Our models can also be trained for each domain, achieving new state-of-the-art on multiple datasets. © 2021 Association for Computational Linguistics
dc.language.iso: en
dc.publisher: Association for Computational Linguistics (ACL)
dc.rights: Copyright © 2021 Association for Computational Linguistics. Licensed under a Creative Commons Attribution 4.0 International License.
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.title: Triplet-Trained Vector Space and Sieve-Based Search Improve Biomedical Concept Normalization
dc.type: Proceedings
dc.type: text
dc.contributor.department: School of Information, University of Arizona
dc.identifier.journal: Proceedings of the 20th Workshop on Biomedical Language Processing, BioNLP 2021
dc.description.note: Open access journal
dc.description.collectioninformation: This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at repository@u.library.arizona.edu.
dc.eprint.version: Final published version
dc.source.journaltitle: Proceedings of the 20th Workshop on Biomedical Language Processing, BioNLP 2021
refterms.dateFOA: 2022-03-17T01:56:58Z
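
The abstract above describes training transformer encoders with a triplet objective and online hard triplet mining, sampling training triples within each mini-batch. The sketch below illustrates one common form of that mining step ("batch-hard" mining), assuming a PyTorch setup. It is not the authors' implementation; the function name, the margin value, the Euclidean distance, and the `encoder` in the usage comment are illustrative assumptions.

    # Minimal sketch of online ("batch-hard") triplet mining, assuming
    # PyTorch. An illustration of the general technique, not the paper's code.
    import torch
    import torch.nn.functional as F

    def batch_hard_triplet_loss(embeddings, concept_ids, margin=1.0):
        """For each anchor in the batch, take the hardest positive (the
        farthest embedding with the same concept ID) and the hardest
        negative (the closest embedding with a different concept ID)."""
        # Pairwise Euclidean distances between all embeddings in the batch.
        dist = torch.cdist(embeddings, embeddings, p=2)

        labels = concept_ids.unsqueeze(0)
        pos_mask = labels == labels.t()  # True where two rows share a concept

        # Hardest positive: maximum distance among same-concept pairs.
        hardest_pos = (dist * pos_mask.float()).max(dim=1).values

        # Hardest negative: minimum distance among different-concept pairs;
        # positives are excluded by inflating their distances before the min.
        inflate = dist.max().detach() + 1.0
        hardest_neg = (dist + inflate * pos_mask.float()).min(dim=1).values

        return F.relu(hardest_pos - hardest_neg + margin).mean()

    # Hypothetical usage: `encoder` stands in for the trained transformer.
    # loss = batch_hard_triplet_loss(encoder(mention_batch), concept_ids)

Because triplets are formed only among the examples already present in each mini-batch, no pass over all concept pairs is ever needed, which is what keeps training tractable even with hundreds of thousands of concepts.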


Files in this item

Name: 2021bionlp_1_2.pdf
Size: 374.5 KB
Format: PDF
Description: Final Published Version
