Morpho Challenge http://research.ics.tkk.fi/events/morphochallenge2010/ is a competition held every year or two to develop unsupervised morphological analysers for Finnish and other languages. Every year there are datasets, including Finnish ones. I suggest you look at the early years for
'normal written Finnish' -> 'morphologically decomposed Finnish'
as in later years, e.g. 2010, the challenge has become more demanding, asking for
'normal written Finnish' -> 'morphemes and feature-labels for Finnish'
You should also contact the organisers directly, as they probably have the best ideas to guide you.
Hope this helps,
Eric Atwell, Leeds University
On Tue, 3 Apr 2012, Carter, Simon wrote:
> Dear Corpora List,
> I am doing a bit of work on Finnish, and was wondering if anyone was aware
> of any data sets that have
> 'normal written Finnish' -> 'morphologically decomposed Finnish',
> preferably produced by a human. I am aware of some FSMs that can output many
> different such decompositions,
> but are there any human-produced sets, maybe used for training/testing