[Corpora-List] Finnish Morphological Analysis

Eric Atwell csc6ea at leeds.ac.uk
Tue Apr 3 21:12:52 CEST 2012


Morpho Challenge (http://research.ics.tkk.fi/events/morphochallenge2010/) is a competition, held every year or two, to develop unsupervised morphological analysers for Finnish and other languages. Every year there are datasets including Finnish. I suggest you look at the early years for

'normal written Finnish' -> 'morphologically decomposed Finnish'

as in later years, e.g. 2010, the challenge has become more demanding, asking for

'normal written Finnish' -> 'morphemes and feature-labels for Finnish'

You should also contact the organisers directly, as they probably have the best ideas to guide you.

Hope this helps.

Eric Atwell, Leeds University

On Tue, 3 Apr 2012, Carter, Simon wrote:

> Dear Corpora List,
> I am doing a bit of work on Finnish, and was wondering if anyone was aware
> of any data sets that have
> 'normal written Finnish' -> 'morphologically decomposed Finnish', 
> preferably by a human. I am aware of some FSMs that can output many
> different such decompositions,
> but are there any human-produced sets, maybe used for training/testing
> purposes?
> Regards,
> Simon
