[Corpora-List] All English Text Messaging Corpus?

Trevor Jenkins trevor.jenkins at suneidesis.com
Mon Apr 11 14:41:54 CEST 2011

On Sat, 9 Apr 2011, Rich Cooper <rich at englishlogickernel.com> wrote:

> Hi Laura,
> I don't know of any text message sources exactly like what you are
> describing. But there is a huge, partially structured text database for US
> patent documents, nearly all in English I suppose, ...

Surely a highly specific genre of English; all about claims and prior art. Nothing like English as she is spoke by the people likely to be sending text messages.

> One advantage of choosing the patent database is that every document is
> constrained by the patenting process by experts in each patent's specific
> technologies, ...

Um, not so true. The (possibly once) head of the USPTO admitted that many dubious patents were being granted because the organisation did not have the necessary expertise to evaluate the claims. This comment was specifically made in relation to software patents, which are in any case a highly contentious area.

Regards, Trevor

<>< Re: deemed!
