You should find some useful material in the Proceedings of the Hyderabad workshop on Noisy Texts: http://research.ihost.com/and2007/cd/Proceedings_files/toc.htm
The Stubbe-Ringlstetter-Schulz paper seems a good starting point: http://research.ihost.com/and2007/cd/Proceedings_files/p9.pdf
Good luck,
Mirko Tavosanis Dipartimento di Studi italianistici Universita' di Pisa