[Corpora-List] Russian lexical database

Francis Tyers ftyers at prompsit.com
Thu Jul 13 17:18:22 CEST 2017


On 2017-07-13 16:50, Matías Guzmán Naranjo wrote:
> Dear all,
>
> I am looking for a database of Russian nouns and verbs
> containing all the cells in their paradigms, stress marking, and
> inflection class. The closest I have found are the Wiktionary entries
> for Russian forms, but they seem tricky to parse. Zalizniak's
> dictionary found here:
> http://starling.rinet.ru/cgi-bin/response.cgi?root=%2fusr%2flocal%2fshare%2fstarling%2fmorpho&morpho=0&basename=morpho\zaliznia\dict&first=1
> doesn't seem to have stress information.
>
> Are there any other freely accessible resources out there?
>
> Thanks for your help!

Hey there, you can generate such a database using UDAR, which is free/open-source.[1]

The process is fairly involved, but if you'd like instructions I can send you some.

Depending on the exact dataset you need, I may have some pre-prepared data (e.g. the top 5000 noun, adjective, and verb lemmas in SynTagRus with full paradigms).

Fran

1. Reynolds, Robert and Francis Tyers. “Automatic word stress annotation of Russian unrestricted text.” In Nordic Conference of Computational Linguistics NODALIDA 2015, p. 173. 2015.
