[Corpora-List] Looking for terminology extraction gold standards

Mihael mihael.arcan at insight-centre.org
Wed Apr 26 11:50:08 CEST 2017


Hi Andraz,

We once annotated terms in a subset of the GNOME and KDE parallel documents in the IT domain, for English, German and Italian [1]. Take a look and let me know if it is helpful.

Regards, Mihael

[1] https://hlt-mt.fbk.eu/technologies/bittercorpus

On 25/04/17 18:25, Andraz Repar wrote:
> Hello,
>
> this is my first post to Corpora-List, so please be gentle :)
>
> I am looking for publicly available gold standard corpora for
> terminology extraction. Ideally, this would be a corpus where all
> terms have been annotated.
>
> I haven't been able to find any myself, and I realize this is probably
> a long shot. I would prefer European languages, but at this point I am
> not too picky and would take anything.
>
> Best regards,
> Andraž Repar
>
> International Postgraduate School Jožef Stefan, Ljubljana, Slovenia

--
Dr. Mihael Arcan
Postdoctoral Researcher at Unit for Natural Language Processing (UNLP)
Insight Centre for Data Analytics @ NUI Galway
http://nuig.insight-centre.org/unlp/people/members/mihael-arcan/
