[Corpora-List] Corpora, annotation schema/guidelines and NER systems for LOCATION or geonames

Lenardič, Jakob Jakob.Lenardic at ff.uni-lj.si
Fri Oct 29 10:36:55 CEST 2021


Hi Salvador,

At CLARIN ERIC (https://www.clarin.eu/), we maintain an overview of the Named Entity Recognition tools available in our infrastructure, most of which recognize Location entities: https://www.clarin.eu/resource-families/tools-named-entity-recognition. There you'll find quite a few tools for English, German, Spanish, etc.

If you're interested in components such as language models, downloadable models for Czech as well as English exist for the NER tool NameTag 2.0; see http://hdl.handle.net/11234/1-3773

Best, Jakob

---
Dr Jakob Lenardič
Researcher, Department of Translation
Faculty of Arts, University of Ljubljana
jakob.lenardic at ff.uni-lj.si
________________________________
From: corpora-bounces at uib.no <corpora-bounces at uib.no> on behalf of Salvador Lima <salvador.limalopez at gmail.com>
Sent: Thursday, 28 October 2021 17:53
To: corpora at uib.no <corpora at uib.no>
Subject: [Corpora-List] Corpora, annotation schema/guidelines and NER systems for LOCATION or geonames

Dear all,

We are trying to build a more comprehensive view of the current NLP resources for the annotation, automatic recognition, and normalization/grounding of LOCATION or geonames/place-related entity types (for data in English and, in particular, in other languages).

We have already looked at the ENAMEX tagset (Location and sub-tags) and guidelines, as well as ACE and CLIA.

We would really appreciate feedback on current NER and entity linking components, corpora, and also annotation guidelines for different languages, including English, Spanish, Italian, French, German, Portuguese, or Swedish. Anything with a special focus on movements and travels would also be really interesting.

Best regards,

--
Salvador Lima Lopez
RESEARCH ENGINEER
Life Sciences - Text Mining, BSC-CNS
Barcelona, Spain


