[Corpora-List] Question about evaluation

Alexander Osherenko osherenko at gmx.de
Mon Dec 3 09:01:38 CET 2012

Dear Emad,

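You would need a measure averaged over classes. For recall, for example: compute for each class the number of correctly classified instances of that class divided by the total number of instances of that class, then average these per-class values. Dividing the number of correctly classified instances by the overall number of instances gives plain accuracy instead, which on your data is dominated by class S.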
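
To illustrate why averaging over classes matters with a 9:1 class skew like yours, here is a minimal sketch in Python. The data are made up (100 tokens, 90 S and 10 E), the classifier is a hypothetical baseline that always answers S, and the function names are my own, not from any library.

```python
from collections import Counter


def per_class_recall(gold, predicted):
    """Recall per class: correct instances of the class / all gold instances of the class."""
    totals = Counter(gold)
    correct = Counter(g for g, p in zip(gold, predicted) if g == p)
    # Counter returns 0 for a class that was never predicted correctly.
    return {c: correct[c] / totals[c] for c in totals}


def macro_recall(gold, predicted):
    """Unweighted mean of the per-class recall values."""
    recalls = per_class_recall(gold, predicted)
    return sum(recalls.values()) / len(recalls)


def accuracy(gold, predicted):
    """Correctly classified instances / all instances."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


# Made-up toy data with the same 9:1 skew as the corpus described below.
gold = ["S"] * 90 + ["E"] * 10
# Hypothetical baseline that always predicts the majority class.
always_s = ["S"] * 100

print(accuracy(gold, always_s))          # high, although no E is ever found
print(per_class_recall(gold, always_s))  # recall for S and for E separately
print(macro_recall(gold, always_s))      # average over the two classes
```

On this toy data the always-S baseline reaches an accuracy of 0.9 but a recall of 0 for class E, so the class-averaged recall is only 0.5. Since you are mainly interested in class E, reporting the per-class values alongside the average seems the safer choice.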


2012/12/2 Emad Mohamed <emohamed at umail.iu.edu>

> Hello Corpora members,
> I have a corpus of 80,000 words in which each word is assigned either the
> class S or the class E. Class S occurs 72,000 times while class E occurs
> 8,000 times only.
> I'm wondering what the best way to evaluate the classifier performance
> should be. I have randomly selected a dev set (5%) and a test set (10%).
> I'm mainly interested in predicting which words are class E.
> I've read this page:
> webdocs.cs.ualberta.ca/~eisner/measures.html
> but I'm still a little bit confused. Do we use specificity in linguistics
> papers? Should I report these measures for each of the two classes or as
> a general number? Does this make sense / a difference?
> Thank you so much.
> --
> Emad Mohamed
> aka Emad Nawfal
> Université du Québec à Montréal

-- Alexander Osherenko Dr. rer. nat, CEO and R&D <http://www.socioware.de/>