[Corpora-List] What are the terms used for sentence-, paragraph- and text-level analysis?

Alon Lischinsky alischinsky at gmail.com
Wed Feb 20 19:52:40 CET 2013

On 2013/2/20 Matías Guzmán <mortem.dei at gmail.com> wrote:

>>The question remains why do we need "fancy" expressions at all?
> Because linguistics can't possibly afford to have a clear and unified
> terminology

Whatever the value of that argument in general, it's certainly not applicable to this case: as far as I know, ‘lemma’ is the universally used term for canonical forms (in lexicography) or uninflected forms (in morphology and computational linguistics). You don't get much more unified than that.

As for clarity, I've never seen what's wrong with having a term of art for a concept that doesn't have an unambiguous correlate in everyday usage. ‘Word’ wouldn't work in this case, since the point of ‘lemma’ is to group various inflected word-forms under a common heading. In fact, employing the everyday term can be a source of considerable confusion between the lay and the specialised meanings (as Arnold Zwicky has pointed out regarding ‘grammar’ [http://arnoldzwicky.wordpress.com/2012/02/22/its-all-grammar/] and Geoffrey Pullum regarding ‘passive’ [http://languagelog.ldc.upenn.edu/nll/?p=2922]).

