Modeling semantic coherence from corpus data: the fact and the frequency of a co-occurrence
dc.contributor.author | Pekar, Viktor | |
dc.date.accessioned | 2011-03-31T18:02:32Z | |
dc.date.available | 2011-03-31T18:02:32Z | |
dc.date.issued | 2001 | |
dc.identifier.issn | 0894-4539 | |
dc.identifier.uri | http://hdl.handle.net/10150/126619 | |
dc.description | Published as Coyote Papers: Working Papers in Linguistics, Language in Cognitive Science | en_US |
dc.description.abstract | The paper presents a preliminary evaluation of a corpus-based representation of individual words and a method to generalize over these representations. The vector space is represented in a way that gives weight to the fact that words co-occur rather than to the frequency of their co-occurrence. This format is hypothesized to reduce the vector space, minimize the negative effects of data sparseness, and enhance the ability of the model to generalize words to novel contexts. The model is assessed by comparing computer-calculated probabilities of different verb-argument combinations with human subjects' judgements about the appropriateness of these combinations. The results indicate a correlation between the probabilities calculated by the model and the subjects' evaluations. | |
dc.language.iso | en_US | en_US |
dc.publisher | University of Arizona Linguistics Circle (Tucson, Arizona) | en_US |
dc.relation.url | https://coyotepapers.sbs.arizona.edu/ | en_US |
dc.rights | Copyright © is held by the author(s). | en_US |
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en_US |
dc.title | Modeling semantic coherence from corpus data: the fact and the frequency of a co-occurrence | en_US |
dc.type | text | en_US |
dc.type | Article | en_US |
dc.contributor.department | Bashkir State University | en_US |
dc.identifier.journal | Coyote Papers | en_US |
dc.description.collectioninformation | The Coyote Papers are made available by the Arizona Linguistics Circle at the University of Arizona and the University of Arizona Libraries. Contact coyotepapers@email.arizona.edu with questions about these materials. | en_US |
dc.source.journaltitle | Coyote Papers | |
refterms.dateFOA | 2018-06-12T10:51:13Z | |