SemEval-2020 Task 3: Predicting the (Graded) Effect of Context on Word Similarity is officially in the "evaluation phase". Participants can submit their results until the 12th of March.

The task asks participants to predict how annotators will score the similarity of two words when the words are presented within the context of a short text, with two different short texts per pair. Four languages are included: English, Croatian, Slovene and Finnish.


Participants now have access to the "test data", which consists of the target word pairs and two different contexts for each pair. The human annotation results are hidden from them, but they receive their scores when they make a submission through the system (maximum of 9 submissions).
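
As a rough illustration of what a system computes for each test item, the sketch below scores one word pair in each of its two contexts using cosine similarity and reports the difference. It is only a minimal sketch under assumptions: the vectors are made-up toy values standing in for contextual embeddings from whatever model a participant uses, the function names are hypothetical, and this is neither the baseline script nor the official evaluation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def score_pair(vecs_context1, vecs_context2):
    """Hypothetical helper: similarity of the pair in each context, plus the change.

    Each argument is a (word1_vector, word2_vector) tuple for one context.
    """
    sim1 = cosine(*vecs_context1)
    sim2 = cosine(*vecs_context2)
    return sim1, sim2, sim2 - sim1


# Toy vectors (assumed values, not real embeddings).
ctx1 = ([1.0, 0.0], [1.0, 0.0])  # the two words look identical in context 1
ctx2 = ([1.0, 0.0], [0.0, 1.0])  # the two words look unrelated in context 2

sim1, sim2, change = score_pair(ctx1, ctx2)
print(f"context 1: {sim1:.2f}, context 2: {sim2:.2f}, change: {change:+.2f}")
```

In practice the toy vectors would be replaced by the representations a model assigns to each target word inside each context text.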

The "Evaluation Kit" (attached) includes the test data, a baseline created with multilingual BERT, a Python script to replicate the baseline, and a PDF with the instructions we gave to the annotators and an example of what the annotation surveys looked like. We have 340 English pairs, 112 Croatian, 111 Slovene and 24 Finnish.


The "Practice Phase" is still open to submissions (with no maximum limit). We updated the "Practice Kit" (attached) with new "trial data" that contains 10 English pairs, 5 Croatian and 5 Slovene, randomly selected from the real dataset. These do include the human annotation results, so participants can have a look at the evaluation process (the actual evaluation scripts are included) and get some preliminary feedback for their models.


Thanks and best of luck!

Carlos, Matt & the EMBEDDIA project team http://embeddia.eu/

