> Hi all,
>
> As a disclaimer, I have not worked with any of the tokenizers. For the type
> of results originally reported, however, I do have a suggestion for a
> possible partial explanation, based on some experience with Spanish. There
> is a real stylistic rule in Spanish which makes speakers and especially
> writers avoid repeating the same 'content word' within the same or
> contiguous sentences or clauses, using instead a synonym or paraphrase.
... and the same is true for Italian.
Steve Coffey.