You might find these tools useful:
- https://stanfordnlp.github.io/stanfordnlp/ has POS tagging,
  lemmatization, and dependency parsing models for Ukrainian trained on
  Universal Dependencies.
- https://github.com/kmike/pymorphy2 uses
  https://github.com/brown-uk/dict_uk to output all possible lemmas and
  parts of speech for a Ukrainian word, but it doesn't disambiguate
  between them.
    - Please install it using
      `pip install git+https://github.com/kmike/pymorphy2.git pymorphy2-dicts-uk`
    - (It doesn't work for Ukrainian if you install it directly via pip.)
- If you need a simple tokenizer, you can use
  https://github.com/lang-uk/tokenize-uk.
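
To illustrate what "doesn't disambiguate" means in practice, here is a minimal sketch of collecting every candidate (lemma, POS) pair per token. It assumes pymorphy2 and tokenize-uk are installed as described above; the helper names `candidate_analyses`, `analyse_text`, and `demo` and the sample sentence are made up for this example and are not part of either library.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Candidate = Tuple[str, Optional[str]]


def candidate_analyses(morph, word: str) -> List[Candidate]:
    """Return every distinct (lemma, POS) pair proposed for `word`.

    `morph` is assumed to behave like pymorphy2.MorphAnalyzer: parse()
    returns objects exposing .normal_form and .tag.POS (POS may be None,
    e.g. for punctuation). No disambiguation is attempted; duplicates
    are dropped and the analyzer's order is kept.
    """
    seen: List[Candidate] = []
    for parse in morph.parse(word):
        pos = parse.tag.POS
        pair = (parse.normal_form, str(pos) if pos is not None else None)
        if pair not in seen:
            seen.append(pair)
    return seen


def analyse_text(
    morph, tokenize: Callable[[str], Iterable[str]], text: str
) -> List[Tuple[str, List[Candidate]]]:
    """Tokenize `text` and attach all candidate analyses to each token."""
    return [(token, candidate_analyses(morph, token)) for token in tokenize(text)]


def demo() -> None:
    """Print the candidates for a sample sentence (needs both packages)."""
    # Imported here so the helpers above stay usable without the libraries.
    import pymorphy2
    from tokenize_uk import tokenize_words

    morph = pymorphy2.MorphAnalyzer(lang="uk")
    for token, candidates in analyse_text(morph, tokenize_words, "Мама мила раму."):
        print(token, candidates)
```

Calling `demo()` prints one line per token. A token with more than one candidate is one you would still need to disambiguate, for instance with one of the trained taggers mentioned above.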

Best regards,
Mariana Romanyshyn

On Thu, 21 Nov 2019 at 14:05, Vladimír Benko <vladimir.benko at juls.savba.sk> wrote:
> Dear Daniel,
>
> You may want to try to train the TreeTagger yourself using the Ukrainian
> Treebank available from the Universal Dependencies site. Alternatively,
> you also can tag your corpus by UDPipe with the language model trained on
> that treebank.
>
> Best,
>
> Vlado B, 12:55
>
> Dear colleagues,
>
> Does anyone know if Ukrainian parameters exist for TreeTagger (there's no
> mention of them on the website), or if there's another tagger similar to
> TreeTagger that could add POS and Lemma tags to Ukrainian?
>
> Thanks in advance for any help.
>
> Best regards,
> --
> Daniel HENKEL <https://univ-paris8.academia.edu/DanielHENKEL>
>
> *Maître de Conférences (Linguistique et Traduction) UFR5 LLCE-LEA • EA1569
> TransCrit*
> Université Paris 8 Vincennes-St-Denis
>
>
> *“non si può stendere una tipologia delle traduzioni, ma al massimo una
> tipologia di diversi modi di tradurre, volta per volta negoziando il fine
> che ci si propone – e volta per volta scoprendo che i modi di tradurre sono
> più di quelli che sospettiamo.”* U. Eco
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> https://mailman.uib.no/listinfo/corpora
>
>
> --
> Vladimír Benko
>
> Slovak Academy of Sciences
> Ľ. Štúr Institute of Linguistics
> Panská 26, SK-81101 Bratislava
>
> Tel +421-2-54431762 Fax -54431756
>
> http://aranea.juls.savba.sk/guest/
> https://www.facebook.com/araneawebcorpora/