[Corpora-List] All English Text Messaging Corpus?

Angus B. Grieve-Smith grvsmth at panix.com
Mon Apr 11 16:06:15 CEST 2011

On 4/11/2011 9:43 AM, Khurshid Ahmad wrote:
> Dear Laura
> I am writing to support Rich.
> The USPTO documents are in Legal English and are written by Patent
> Attorneys -these documents form a representative sample of American
> English and to a lesser extent that of other national varieties of written
> English.

Uh, people? Laura asked for a text messaging corpus. As in short, presumably spontaneous messages. I think her query was very interesting, because we know that people use text and chat for many purposes besides personal ones. Such a corpus (preferably with timing data) would enable us to differentiate between variation due to formality and variation due to spontaneity and time pressure.

Rich hijacked that query to promote patents (which are a completely different text type) and his own software project (which is not remotely necessary to analyze the publicly available patent database). Of course patents are worthwhile objects of study, but they're not text messages, and Rich's response was essentially spam.

If anyone wants to promote their own project or suggest an area that deserves further study, go ahead, but start your own thread.


-Angus B. Grieve-Smith

grvsmth at panix.com
