[Corpora-List] Corpora, annotation schema/guidelines and NER systems for LOCATION or geonames

Leonhard Hennig leonhard.hennig at dfki.de
Fri Oct 29 13:16:09 CEST 2021

Hi Salvador,

For German, there is also a dataset provided by DFKI (https://github.com/DFKI-NLP/MobIE) with locations and location subtypes (e.g. streets) linked to OpenStreetMap.

Best, Leo

On 28.10.21 at 17:53, Salvador Lima wrote:
> Dear all,
> We are trying to build a more comprehensive picture of the current NLP
> resources for the annotation, automatic recognition, and
> normalization/grounding of LOCATION or geonames/place-related entity
> types (for data in English and, in particular, in other languages).
> We have looked at the ENAMEX tagset (Location and sub-tags) and
> guidelines, as well as ACE and CLIA.
> We would really appreciate feedback on current NER and entity linking
> components, corpora, and also annotation guidelines for different
> languages, including English, Spanish, Italian, French, German,
> Portuguese, or Swedish. Anything with a special focus on movements and
> travels would also be really interesting.
> Best regards,
> --
> Salvador Lima Lopez
> Life Sciences - Text Mining, BSC-CNS
> Barcelona, Spain

--
Dr.-Ing. Leonhard Hennig
Senior Researcher, Speech & Language Technology
DFKI Projektbüro Berlin
Alt-Moabit 91c, D-10559 Berlin, Germany
Phone +49-30-23895-1821
Office +49-30-23895-1800
E-Mail leonhard.hennig at dfki.de

