[Corpora-List] Ambiguous words in English and their frequency

Eckhard Bick eckhard.bick at mail.dk
Thu Jan 26 10:30:57 CET 2012


It depends on what you call ambiguity, of course, and on whether you count types or tokens.

If ambiguity is meant to include inflexional ambiguity within a lemma, such as participle vs. past tense or infinitive vs. finite verb, then a quick mini-run of our Constraint Grammar analysis on Leipzig internet corpus data yields an average of 2.11 readings per English word token, punctuation excluded.
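To make the types-versus-tokens distinction concrete, here is a minimal Python sketch of how such a readings-per-token figure could be computed. It is not our actual pipeline: the input format (a list of token/readings pairs), the tag names, and the names SAMPLE, is_punctuation and ambiguity_stats are all assumptions made up for illustration, and the toy data says nothing about real corpus values.

```python
import string

# Hypothetical analyser output: one (token, readings) pair per running token,
# where readings are the tags still possible for that token.
# Tag names are invented for this example.
SAMPLE = [
    ("She", ["PRON"]),
    ("set", ["V-INF", "V-PAST", "N"]),
    ("the", ["DET"]),
    ("set", ["V-INF", "V-PAST", "N"]),
    (".", ["PUNCT"]),
]


def is_punctuation(token):
    """True if the token consists only of ASCII punctuation characters."""
    return bool(token) and all(ch in string.punctuation for ch in token)


def ambiguity_stats(analysed):
    """Return (readings per token, readings per type, share of ambiguous tokens).

    Punctuation tokens are excluded; word forms are lower-cased to form types.
    """
    words = [(tok.lower(), set(readings))
             for tok, readings in analysed
             if not is_punctuation(tok)]
    if not words:
        raise ValueError("no word tokens in input")

    # Token-based: every occurrence counts, so frequent ambiguous words weigh more.
    per_token = sum(len(r) for _, r in words) / len(words)

    # Type-based: each distinct word form counts once, with the union of its readings.
    by_type = {}
    for form, readings in words:
        by_type.setdefault(form, set()).update(readings)
    per_type = sum(len(r) for r in by_type.values()) / len(by_type)

    # Share of running tokens that have more than one reading.
    ambiguous_share = sum(len(r) > 1 for _, r in words) / len(words)
    return per_token, per_type, ambiguous_share


if __name__ == "__main__":
    per_token, per_type, share = ambiguity_stats(SAMPLE)
    print(f"readings per token: {per_token:.2f}")
    print(f"readings per type:  {per_type:.2f}")
    print(f"ambiguous tokens:   {share:.0%}")
```

On the toy input the token-based average is higher than the type-based one, because the ambiguous form "set" occurs twice. That is the reason the two ways of counting can give quite different answers on real data.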

Best regards, Eckhard

On 2012-01-25 20:33, FORT, Karen wrote:
> Hi all,
> I need to find this information (the proportion of ambiguous words in English and their frequency).
> For example, we know that in French 8% of the words represent 30% of the ambiguity.
> Of course, it's very rough, but it's only to have a rough idea.
> Can somebody help me with this (of course, I searched for a ref but could not find anything precise)?
> Thank you in advance,
> Regards,
> Karën FORT
> Ingénieure/Engineer et/and doctorante/PhD student
> 2, allée de Brabois
> 54500 Vandoeuvre-lès-Nancy
> France
> Bureau/Office: H112
> +33 (0)3 83 50 46 36
> http://www-lipn.univ-paris13.fr/~fort/

--
Eckhard Bick, cand.med., dr.phil.
University of Southern Denmark
e-mail: eckhard.bick at mail.dk
web: http://beta.visl.sdu.dk
