[Corpora-List] Spellchecker evaluation corpus

John F. Sowa sowa at bestweb.net
Mon Apr 11 04:05:46 CEST 2011


On 4/9/2011 7:03 AM, Eric Atwell wrote:
> Jennifer Pedler's PhD developed a spelling-error detection tool,
> evaluated on a corpus of real spelling errors;

Her slides had an example from a UK corpus that would be highly unlikely in the US: {tort, taught}. The two words are homophones in non-rhotic British accents but not in most American ones.

Japanese English is much better than it used to be, but it still has L/R confusions.

Instead of a single corpus, it would be useful to have a set of corpora for authors with different backgrounds.

John Sowa


