[Corpora-List] the ebb and flow of inclusion of words in OED?

John F. Sowa sowa at bestweb.net
Tue Apr 26 17:59:55 CEST 2011


I'd just like to make one comment on the following point:

On 4/26/2011 10:28 AM, Martin Reynaert wrote:

> These are, of course, simple 's' to 'a' OCR-misrecognition errors...

I hope and expect OCR systems (and spelling correction methods used with them) to make major improvements over the next few years. That implies that the scanned versions of all those documents will have to go through another OCR process. (I hope they saved the scanned images.)

John


