[Corpora-List] Word-sentence relatedness data set for evaluating semantic models

Wolmetz, Michael E. Michael.Wolmetz at jhuapl.edu
Wed Apr 6 23:29:17 CEST 2016


We recently posted a data set for evaluating semantic textual similarity (STS) models on arXiv:

Glasgow K, Roos M, Haufler A, Chevillet M, Wolmetz M (2016). Evaluating semantic models with word-sentence relatedness. arXiv:1603.07253<http://arxiv.org/abs/1603.07253> [cs.CL]

The data set consists of 775 English word-sentence pairs, each annotated for semantic relatedness by human raters. As a sample application of this relatedness data, behavior-based relatedness was compared to the relatedness computed via four off-the-shelf STS models: n-gram, LSA, Word2Vec, and UMBC Ebiquity. All text stimuli and judgment data are downloadable as ancillary files.
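
For readers who want to run a similar comparison on their own model, here is a minimal sketch of one way to score model output against human relatedness ratings. It rests on assumptions: this message does not say which statistic the paper uses, so Spearman rank correlation is swapped in here as a common choice for this kind of evaluation. The function names and the numbers are hypothetical and are not taken from the data set.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for illustration only, one entry per word-sentence pair.
human_ratings = [0.9, 0.1, 0.5, 0.7, 0.3]
model_scores = [0.8, 0.2, 0.4, 0.9, 0.1]

rho = spearman(human_ratings, model_scores)
print(round(rho, 3))
```

In practice the two lists would be filled from the ancillary files, with the human rating and the model score for each pair kept in the same order.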


