[Corpora-List] A new set of word embeddings with state-of-the-art word similarity (simLex999) results

Roi Reichart roireichart at gmail.com
Wed Aug 5 19:33:50 CEST 2015


Hello again,

Manaal Faruqui rightly pointed out that our numbers are the best on simLex999 among models that use only distributional information.

The following paper shows better numbers for a model that uses knowledge bases:

http://rd.springer.com/chapter/10.1007/978-3-319-18111-0_25#page-1

And this paper does slightly better than us when using linguistic resources (such as WordNet and FrameNet):

http://www.cs.cmu.edu/~mfaruqui/papers/acl15-nondist.pdf

Best, Roi

On Wed, Aug 5, 2015 at 5:43 PM, Roi Reichart <roireichart at gmail.com> wrote:


> Greetings,
>
> We are happy to announce the release of a new set of word embeddings,
> based on symmetric patterns automatically acquired from unannotated
> text. Our embeddings, described in the paper:
>
> Symmetric Pattern Based Word Embeddings for Improved Word Similarity
> Prediction, Roy Schwartz, Roi Reichart and Ari Rappoport. CoNLL 2015
>
> achieve, to the best of our knowledge, the best published result on
> the word similarity prediction task with the simLex999 data set (Hill,
> Reichart and Korhonen, 2014). Moreover, for verb pairs from simLex999
> the new embeddings outperform any previously published set of
> embeddings by a very large margin (details in the paper).
>
> The embeddings can be downloaded from:
>
> http://www.cs.huji.ac.il/~roys02/papers/sp_embeddings/sp_embeddings.html
>
> Please do not hesitate to contact us if you would like any further
> information.
>
> Best,
> Roy Schwartz, Roi Reichart and Ari Rappoport
>