I am doing a bit of work on Finnish, and was wondering if anyone is aware of any data sets that map
'normal written Finnish' -> 'morphologically decomposed Finnish',
preferably annotated by a human. I am aware of some FSMs that can output many different such decompositions, but are there any human-produced sets, perhaps ones used for training/testing purposes?
Simon