[Corpora-List] ON using a subject in the SUBJECT line

Khurshid Ahmad kahmad
Thu Jan 17 21:43:29 CET 2013


Please, folks, use the subject line. I have to open every mail from the Corpora List because I cannot tell whether it is relevant or not.

On 17-01-2013 20:18, maxwell wrote:
> On 2013-01-17 09:57, Eirini LS wrote:
>> I mean that I have two different scripts for the same word (e.g. two
>> scripts for "cat") written by different people. The first script
>> generates 358 words (and only 107 words are correct), and the second
>> script generates 497 words (and 471 words are correct). Can I say that
>> the result of the first script is worse or not?
>
> Clearly the recall and precision on the second script are higher. Of
> course, without knowing what the total number of words that should be
> generated is, it's hard to say more. In particular, it's hard to say
> whether 471 is good. (Is the second script getting 471 out of 500
> possible, or 471 out of 50,000?)
>
> In general, though, I think comparing at this gross level is only
> going to give a general sort of answer. What you really want is a
> test set where each input word is paired with its expected output
> word, so you can do error analysis and regression testing.
>
> Mike Maxwell
>
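
For anyone following the quoted thread, here is a rough sketch of the arithmetic behind it. Precision follows directly from the counts given (correct / generated). Recall needs the total number of words that should be generated, which the thread does not state, so the totals 500 and 50,000 below are only the illustrative figures from the quoted reply. The function names, the toy gold pairs and the toy generator are all assumptions made up for illustration; they are not the scripts discussed.

```python
def precision(correct, generated):
    """Fraction of the generated words that are correct."""
    return correct / generated


def recall(correct, expected_total):
    """Fraction of the expected words that were correctly generated."""
    return correct / expected_total


# Counts quoted in the thread: (correct, generated) for each script.
p1 = precision(107, 358)
p2 = precision(471, 497)
print(f"script 1 precision: {p1:.3f}")
print(f"script 2 precision: {p2:.3f}")

# The true total is unknown; these two values are only the
# hypothetical totals mentioned in the quoted reply.
for total in (500, 50_000):
    print(f"script 2 recall if {total} words are expected: "
          f"{recall(471, total):.5f}")


def mismatches(generate, gold_pairs):
    """Compare a generator against a test set of input -> expected output.

    Returns a list of (input, expected, actual) for every failing pair,
    which is what error analysis and regression testing need.
    """
    failed = []
    for word, expected in gold_pairs.items():
        actual = generate(word)
        if actual != expected:
            failed.append((word, expected, actual))
    return failed


# Toy test set and toy generator, both hypothetical.
gold = {"cat": "cats", "mouse": "mice"}
toy_generate = lambda w: w + "s"
print(mismatches(toy_generate, gold))
```

Rerunning mismatches() on the same gold pairs after each change to a script shows which words broke and which were fixed, which the gross counts alone cannot show.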

-- Best wishes

Khurshid Ahmad. PhD, FBCS, FTCD, CITP
Professor of Computer Science
School of Computer Science and Statistics
Trinity College, Dublin 2, IRELAND

Phone: 00353 1 896 8429 (Labs: 00 353 1 8968435)
Fax: 353 1 677 2204
Webpage: www.cs.tcd.ie/khurshid.ahmad


