[Corpora-List] Multiple category assignment

Aliabbas Petiwala aliabbasjp at gmail.com
Sun Aug 25 16:55:04 CEST 2013


For the task of building a human-annotated corpus:

There are annotation tasks where the items belong to multiple categories and annotators have to mark each category to which the item belongs.

E.g., the same coder c1 assigns the two categories (v1, v2) to item '1':

task = AnnotationTask(data=[('c1', '1', 'v1'), ('c1', '1', 'v2'), ...])
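
To make the situation concrete, here is a minimal sketch in plain Python of how such (coder, item, label) triples could be collapsed into one label set per coder and item. The helper name group_labels and the second coder c2 are hypothetical, introduced only for illustration; nothing here depends on any particular library.

```python
from collections import defaultdict


def group_labels(triples):
    """Collapse (coder, item, label) triples into one frozenset per (coder, item)."""
    grouped = defaultdict(set)
    for coder, item, label in triples:
        grouped[(coder, item)].add(label)
    # Freeze the sets so they are hashable and can serve as single labels.
    return {key: frozenset(labels) for key, labels in grouped.items()}


# Hypothetical data: c1 assigns v1 and v2 to item '1'; c2 assigns only v1.
triples = [("c1", "1", "v1"), ("c1", "1", "v2"), ("c2", "1", "v1")]
label_sets = group_labels(triples)
print(label_sets[("c1", "1")] == frozenset({"v1", "v2"}))  # True
```

Each (coder, item) pair then carries exactly one value, a set of categories, rather than several competing single-category rows.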

So should such multiple categories be represented as bitstrings, so that n categories yield a whopping 2^n possible assignments? This would surely make the inter-annotator agreement (IAA) scores very low even for minor differences.
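
The concern can be illustrated with a small sketch in plain Python. The helper names (to_bitstring, nominal_distance, jaccard_distance) and the three-category inventory are assumptions made up for this example. Comparing bitstrings as atomic labels counts any difference as total disagreement, whereas a set-based distance such as Jaccard gives partial credit for overlap. Whether such a distance suits a given corpus is an open question, not a recommendation.

```python
CATEGORIES = ["v1", "v2", "v3"]  # assumed category inventory (n = 3)


def to_bitstring(labels):
    """Encode a set of categories as one bit per category, in CATEGORIES order."""
    return "".join("1" if c in labels else "0" for c in CATEGORIES)


def nominal_distance(a, b):
    """All-or-nothing comparison: 0.0 if identical, 1.0 otherwise."""
    return 0.0 if a == b else 1.0


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


c1 = frozenset({"v1", "v2"})  # hypothetical coder c1
c2 = frozenset({"v1"})        # hypothetical coder c2

print(2 ** len(CATEGORIES))                                  # 8 possible bitstrings
print(to_bitstring(c1), to_bitstring(c2))                    # 110 100
print(nominal_distance(to_bitstring(c1), to_bitstring(c2)))  # 1.0
print(jaccard_distance(c1, c2))                              # 0.5
```

As far as I know, NLTK's AnnotationTask accepts a distance argument, and nltk.metrics.distance provides jaccard_distance and masi_distance for set-valued labels, so a weighted coefficient might be computed along these lines; that would need checking against the NLTK documentation.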

So what is the best way to compute annotation agreement for tasks that allow multiple category assignments per item? And how should the categories be represented in such cases?
