[Corpora-List] Constitution

John F. Sowa sowa at bestweb.net
Sun May 15 15:41:00 CEST 2005


These issues are typical of a large number of applications.
The current discussion is about EU documents, but very
similar issues arise in commercial applications.

For example, a multinational company X, which has its roots
in country A, will typically have most documents written in
the language of A. Large numbers of new terms that refer to
X's products and their novel features will be coined in terms
of the A lexicon, and corresponding terms will have to be
coined in each of the languages for each of the markets
to which X sells its products. Furthermore, the product
developers will constantly be extending the terminology,
coining new terms, and modifying the meanings of old terms
even while terminologists are trying to define equivalents.

Replace the term "EU" with "market" and the same problems
exist in commercial applications:

> I'm not sure that corpus methods are very relevant.
> For new and newish EU languages, the corpora don't exist
> yet or where they do, are likely to be full of
> inconsistencies and not dependable.

The denotation of the term "corpus methods" changes with
every new method that anyone invents. If the current methods
can't handle problems of this sort, developing new ones is
an important challenge for researchers.

John Sowa
