[Corpora-List] All English Text Messaging Corpus?

Rich Cooper rich at englishlogickernel.com
Mon Apr 11 18:43:20 CEST 2011


Hi Trevor,

Yes, the PTO backlog is estimated at about one million unprocessed applications. The director's point concerned the way his funds have been diverted from the PTO examiner staff to Congress's political slush funds.

English is English, whether in text messaging or in other kinds of corpora, so while the text messaging corpus may be useful to study for those purposes, the real issue is how English is structured in actual usage. The PTO has documents that were very carefully reviewed, which is almost certainly not the case in text messaging. Therefore the patent database makes a great corpus for finding specialized language as used to describe reality within the specification of the patent.

JMHO, -Rich

Sincerely,
Rich Cooper
EnglishLogicKernel.com
Rich AT EnglishLogicKernel DOT com
9 4 9 \ 5 2 5 - 5 7 1 2

-----Original Message-----
From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of Trevor Jenkins
Sent: Monday, April 11, 2011 5:42 AM
To: Corpora list
Subject: Re: [Corpora-List] All English Text Messaging Corpus?

On Sat, 9 Apr 2011, Rich Cooper <rich at englishlogickernel.com> wrote:


> Hi Laura,
>
> I don't know of any text message sources exactly like what you are
> describing. But there is a huge, partially structured text database for US
> patent documents, nearly all in English I suppose, ...

Surely a highly specific genre of English; all about claims and prior art. Nothing like English as she is spoke by the people likely to be sending text messages.


> One advantage of choosing the patent database is that every document is
> constrained by the patenting process by experts in each patent's specific
> technologies, ...

Um, not so true. The (possibly once) head of the USPTO admitted that many dubious patents were being granted because the organisation did not have the necessary expertise to evaluate the claims. This comment was specifically made in relation to software patents, which are in any case a highly contentious area.

Regards, Trevor

<>< Re: deemed!
