* English-German
* English-Italian
* English-Russian
* German-Italian
* German-Russian
* Italian-Russian
To download, please visit http://lcl.uniroma1.it/similarity-datasets/
*References:*
José Camacho-Collados, Mohammad Taher Pilehvar and Roberto Navigli. *A Framework for the Construction of Monolingual and Crosslingual Word Similarity Datasets*. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics (ACL 2015), Beijing, China, 2015, pp. 1-7.
Ira Leviant and Roi Reichart. *Judgment Language Matters: Multilingual Vector Space Models for Judgment Language Aware Lexical Semantics*. 2015. Preprint published on arXiv. arXiv:1508.00106
On Wed, Aug 12, 2015 at 7:32 PM, Roi Reichart <roiri at ie.technion.ac.il> wrote:
> Greetings,
>
> We would like to announce the release of a new resource - multilingual
> WS353. This resource consists of translations of the WS353 word
> association data set to three languages: German, Italian and Russian.
> Each of the translated datasets is scored by 13 human judges (crowd
> workers) - all fluent speakers of its language. For consistency, we
> also collected human judgments for the original English corpus
> according to the same protocol applied to the other languages.
>
> This dataset makes it possible to explore the impact of the "judgment
> language" (the language in which word pairs are presented to the human
> judges) on the resulting similarity scores, and to evaluate vector
> space models in a truly multilingual setup (i.e. when both the
> training and the test data are multilingual).
>
> The translation and annotation process, as well as related research on
> the impact of judgment language, are described in the paper:
>
> Judgment Language Matters: Multilingual Vector Space Models for
> Judgment Language Aware Lexical Semantics. 2015. Ira Leviant, Roi
> Reichart. Preprint published on arXiv. arXiv:1508.00106
>
> The data and paper can be downloaded from the project page at:
>
> http://technion.ac.il/~irakr/MultilingualVSMdata.html
>
> We will soon release similar data for the simLex999 word similarity
> dataset.
>
> Please do not hesitate to contact Ira or myself with any question you
> may have regarding this data.
>
> Best,
> Roi Reichart