[Corpora-List] All English Text Messaging Corpus?

Rich Cooper rich at englishlogickernel.com
Sat Apr 9 22:41:45 CEST 2011


Hi Laura,

I don't know of any text message sources exactly like what you are describing. But there is a huge, partially structured text database of US patent documents, nearly all in English I suppose. Each document has been critiqued by expert examiners and edited in the process of negotiating a patent claim set. You can create databases of patent documents on your desktop by downloading the free web client software Elk for Patents (EfP), which is built on the English Logic Kernel (Elk), as described in US Patent 7,209,923. The patent is posted on the web site as well. It teaches ways to combine corpus analysis methods with relational and object-oriented database technologies. See my website to download and try the free program.

EnglishLogicKernel dot com

One advantage of choosing the patent database is that every document is constrained by the patenting process and reviewed by experts in each patent's specific technologies, and its vocabulary is defined, modus ponens, after careful debate and crafting of each claim sentence. That matters because no really effective syntax parser for English has reached widespread usage, with the best of the performers being the Link Grammar Parser (LGP), IMHO. Using the vocabulary of non-noise words defined in patent claims, the English analyst can relate those claim words and phrases to specific objects as they have been described by sentences in the much more verbose specification part of the patent document. This provides an ideal, large, partially structured database and processing environment in which to analyze the English of claim language.
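To make the claim-to-specification idea concrete, here is a rough Python sketch. It is not how Elk or EfP works; the noise-word list, the function names, and the sample claim and specification text are all made up for illustration. It simply indexes each non-noise claim word against the specification sentences that use it.

```python
import re

# Hypothetical noise-word list (an assumption for this sketch);
# a real analysis would use a much fuller one.
NOISE_WORDS = {
    "a", "an", "the", "of", "to", "and", "or", "in", "on", "for", "with",
    "by", "is", "are", "that", "which", "said", "wherein", "comprising",
}


def content_words(text):
    """Return the set of lowercase non-noise words found in text."""
    return {w for w in re.findall(r"[a-z]+", text.lower())
            if w not in NOISE_WORDS}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def link_claim_to_spec(claim, specification):
    """Map each non-noise claim word to the specification sentences using it."""
    sentences = split_sentences(specification)
    index = {}
    for word in sorted(content_words(claim)):
        hits = [s for s in sentences if word in content_words(s)]
        if hits:
            index[word] = hits
    return index


# Invented sample text, not taken from any real patent.
claim = "A fastener comprising a threaded shaft and a locking collar."
spec = ("The threaded shaft is made of steel. "
        "The locking collar slides over the shaft. "
        "Other embodiments are possible.")

for word, sentences in link_claim_to_spec(claim, spec).items():
    print(word, "->", sentences)
```

Exact word matching is the crudest possible link; stemming or phrase matching would be natural next steps, but the sketch shows the basic shape of relating claim vocabulary to specification sentences.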

HTH, -Rich

Sincerely,
Rich Cooper
EnglishLogicKernel.com
Rich AT EnglishLogicKernel DOT com
9 4 9 \ 5 2 5 - 5 7 1 2

-----Original Message-----
From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of Christopherson, Laura
Sent: Saturday, April 09, 2011 12:35 PM
To: corpora at uib.no
Subject: [Corpora-List] All English Text Messaging Corpus?

Hi All,

Do any of you know of a text messaging corpus only in English that is not a collection of someone's personal (and/or family/friends') messages?

Thanks, Laura

_______________________________________________
UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


