[Corpora-List] SIGMORPHON 2022 Shared Tasks

SIG MORPHON sigmorphon at gmail.com
Mon Mar 21 17:41:27 CET 2022


SIGMORPHON is soliciting participation in the following shared tasks for its 2022 workshop, co-located with NAACL in July:

Generalization in Morphological Inflection Generation

Inflection is the task of taking the dictionary form of a word, such as 'run', and changing its form so that it expresses grammatical information such as subject agreement, tense, etc. For example, the form of 'run' used in the Present Continuous is 'running'. Many languages have very extensive inflection, which creates significant data sparsity problems for computational algorithms. Following in the tradition of several tasks that have established the state of the art in inflection, this year's task will investigate the generalization capabilities of proposed architectures. See https://github.com/sigmorphon/2022InflectionST for more details.

Low-resource grapheme-to-phoneme prediction

Spelling is an approximation of the pronunciation of words, with different languages using different strategies to represent the sounds present in the language. This year's task solicits systems that can learn the pronunciation of words from very small corpora (no more than 1000 words) and generalize to closely related languages. See https://github.com/sigmorphon/sigmorphon.github.io/blob/master/sharedtasks/2022/G2P.md for more details.

Morpheme Segmentation

Many computational algorithms use subword methods to bridge the gap between word-based models and character-based ones. However, subword representations such as BPE are not linguistically motivated: they are derived from sequence co-occurrence statistics. This task solicits models that more closely mimic linguistically annotated segmentation schemes, at both the word and sentence level. See https://github.com/sigmorphon/2022SegmentationST for more details.


