You would need a measure averaged over classes -- for example, for recall, the number of correctly classified instances divided by the overall number of instances.
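
To make the averaging concrete, here is a minimal sketch in Python. The confusion counts are invented for illustration (a test set of 8,000 words, assuming the 9:1 S/E ratio carries over), and the helper name `per_class_scores` is hypothetical. Note that "correct instances divided by all instances" weights each class by its frequency and equals plain accuracy, while an unweighted (macro) average treats S and E equally.

```python
# Illustrative only: the counts below are made up, not results from the corpus.
def per_class_scores(tp, fp, fn):
    """Return (precision, recall, F1) for one class from its raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# confusion[gold][predicted], hypothetical 8,000-word test set (7,200 S, 800 E)
confusion = {
    "S": {"S": 7000, "E": 200},
    "E": {"S": 300, "E": 500},
}

labels = list(confusion)
total = sum(sum(row.values()) for row in confusion.values())
correct = sum(confusion[c][c] for c in labels)

# Correct / total: the frequency-weighted average of per-class recall.
accuracy = correct / total

scores = {}
for c in labels:
    tp = confusion[c][c]
    fn = sum(confusion[c][p] for p in labels if p != c)  # gold c, predicted other
    fp = sum(confusion[g][c] for g in labels if g != c)  # gold other, predicted c
    scores[c] = per_class_scores(tp, fp, fn)

# Unweighted (macro) average: each class counts equally.
macro_recall = sum(scores[c][1] for c in labels) / len(labels)

for c in labels:
    p, r, f = scores[c]
    print(f"class {c}: precision={p:.3f} recall={r:.3f} F1={f:.3f}")
print(f"accuracy={accuracy:.4f} macro-recall={macro_recall:.3f}")
```

With these invented counts, accuracy is 0.9375, barely above the 0.90 obtained by always predicting S, while recall on class E is only 0.625. That gap is why the per-class figures for E are worth reporting alongside any averaged number when E is the class of interest.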
2012/12/2 Emad Mohamed <emohamed at umail.iu.edu>
> Hello Corpora members,
> I have a corpus of 80,000 words in which each word is assigned either the
> class S or the class E. Class S occurs 72,000 times while class E occurs
> 8,000 times only.
> I'm wondering what the best way to evaluate the classifier performance
> should be. I have randomly selected a dev set (5%) and a test set (10%).
> I'm mainly interested in predicting which words are class E.
> I've read this page:
> but I'm still a little bit confused. Do we use specificity in linguistics
> papers? Should I report these measures for each of the two classes or as
> a general number? Does this make sense / a difference?
> Thank you so much.
> Emad Mohamed
> aka Emad Nawfal
> Université du Québec à Montréal
--
Alexander Osherenko
Dr. rer. nat, CEO and R&D
<http://www.socioware.de/>