[Corpora-List] unsupervised with semi-supervised

Taras Zagibalov T.Zagibalov at sussex.ac.uk
Wed Apr 23 20:19:19 CEST 2008


Hello, and many thanks to all of you who replied to my question about unsupervised and semi-supervised learning and provided me with interesting and valuable information. Still, I found some of the definitions quite ambiguous, and I would like to clarify a few things. I'd appreciate it if you could share your thoughts on the following examples:

1. System A uses one million labelled examples for training. It works iteratively and can also use the information it obtains for self-training. System B uses only 100 labelled examples and can also use information obtained from unlabelled data to improve its performance. Does this mean that both systems are unsupervised?

2. System A uses a seed vocabulary of 100,000 items for some task. System B uses only two seeds. Are they both unsupervised?

3. System A uses a large set of rules (hundreds) describing a language. System B uses only two rules. Are they both of the same kind (supervised)?

More generally, does the amount of manual input matter when deciding whether a system is unsupervised, semi-supervised, or supervised?

Best regards,
Taras Zagibalov
University of Sussex


