[Corpora-List] querying corpora

Albretch Mueller lbrtchx at gmail.com
Fri Feb 29 15:25:58 CET 2008


I was wondering about the kinds of queries you may run on open corpora out there ~

Let me explain myself with a convoluted example: ~

Could you, say, ask a corpus how many times, and where in a sentence, Shakespeare used words related to "love" when that word is modified by an adverb and the sentence also contains an adjective? The answer would be reported as distributions over the number of words, the POS elements used in those sentences, and the texts as a whole, and "related to" should itself be queryable at a certain level of "measurable relatedness". ~
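To make the example concrete, here is a minimal Python sketch of that kind of query, run over a tiny hand-tagged toy corpus rather than real Shakespeare. Everything in it is an assumption made for illustration: the sentences, the POS tags, the RELATED scores standing in for "measurable relatedness", and the find_hits helper. "Modified by an adverb" is crudely approximated as "has an adverb immediately next to it"; a real system would need a parse.

```python
from collections import Counter

# Toy, hand-tagged sentences as (word, POS) pairs; purely illustrative data.
SENTENCES = [
    [("She", "PRON"), ("truly", "ADV"), ("loves", "VERB"),
     ("the", "DET"), ("gentle", "ADJ"), ("prince", "NOUN")],
    [("Love", "NOUN"), ("is", "VERB"), ("blind", "ADJ")],
    [("He", "PRON"), ("dearly", "ADV"), ("adores", "VERB"),
     ("fair", "ADJ"), ("Juliet", "PROPN")],
]

# Hypothetical relatedness scores to "love", in [0, 1].
RELATED = {"love": 1.0, "loves": 1.0, "adores": 0.8, "likes": 0.5}


def find_hits(sentences, min_relatedness=0.7):
    """Return (sentence index, word position, sentence length, word) tuples."""
    hits = []
    for s_idx, sent in enumerate(sentences):
        tags = [tag for _, tag in sent]
        if "ADJ" not in tags:  # the sentence must also contain an adjective
            continue
        for i, (word, _) in enumerate(sent):
            if RELATED.get(word.lower(), 0.0) < min_relatedness:
                continue
            # Crude stand-in for "modified by an adverb": an adjacent ADV tag.
            neighbours = tags[max(i - 1, 0):i] + tags[i + 1:i + 2]
            if "ADV" in neighbours:
                hits.append((s_idx, i, len(sent), word))
    return hits


hits = find_hits(SENTENCES)
# Distribution of where in the sentence the matching word occurs.
position_counts = Counter(pos for _, pos, _, _ in hits)
print(len(hits), dict(position_counts))
```

Raising min_relatedness narrows the set of words that count as related to "love", which is the knob I have in mind when I say relatedness should be queryable.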

Are there studies on the "queriability" of corpora regarding depth (see the example above), accuracy, speed and other performance features? ~

Are there any text corpora out there that also include phonemes? ~

How would a data retrieval standard like SQL help in outlining a standard for text retrieval? ~
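As a rough illustration of what I mean by leaning on SQL, the sketch below stores one token per row in an in-memory SQLite database and expresses a cut-down version of the "love" query as plain SQL. The token table, its column names and the sample rows are all hypothetical, and "modified by an adverb" is again approximated as "an adverb in the adjacent position"; this is not an existing corpus standard.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per token, with its position in the sentence.
conn.execute(
    "CREATE TABLE token (sent_id INTEGER, pos_in_sent INTEGER, word TEXT, tag TEXT)"
)
rows = [
    (0, 0, "She", "PRON"), (0, 1, "truly", "ADV"), (0, 2, "loves", "VERB"),
    (0, 3, "the", "DET"), (0, 4, "gentle", "ADJ"), (0, 5, "prince", "NOUN"),
    (1, 0, "Love", "NOUN"), (1, 1, "is", "VERB"), (1, 2, "blind", "ADJ"),
]
conn.executemany("INSERT INTO token VALUES (?, ?, ?, ?)", rows)

QUERY = """
SELECT DISTINCT t.sent_id, t.pos_in_sent, t.word
FROM token AS t
JOIN token AS adv
  ON adv.sent_id = t.sent_id
 AND adv.tag = 'ADV'
 AND ABS(adv.pos_in_sent - t.pos_in_sent) = 1   -- adverb right next to the word
WHERE LOWER(t.word) IN ('love', 'loves')
  AND EXISTS (                                   -- sentence also has an adjective
      SELECT 1 FROM token AS adj
      WHERE adj.sent_id = t.sent_id AND adj.tag = 'ADJ'
  )
ORDER BY t.sent_id, t.pos_in_sent
"""
results = conn.execute(QUERY).fetchall()
print(results)
```

The self-join is what makes the positional constraint expressible at all, and it is also where I suspect plain SQL starts to strain once queries involve longer-range structure.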

Thanks

lbrtchx


