[Corpora-List] Coefficients in SVM models

Thomas Proisl thomas.proisl at fau.de
Fri Aug 21 09:17:57 CEST 2015

Dear Ken,

Recursive Feature Elimination (the method used by Guyon et al.), i.e. repeatedly retraining the model and removing the features with the smallest absolute or squared weights, is a good way of arriving at a small “optimal” set of features. Since the number of features you have is several orders of magnitude larger than the number of samples, you should evaluate that feature set on held-out data to guard against excessive overfitting.
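The elimination loop itself is short. In this sketch, `train` stands in for whatever routine fits a linear SVM on the currently active features and returns one weight per active feature; that callable is an assumption here, not part of any particular toolkit:

```python
def rfe(train, n_features, n_keep, step=1):
    """Recursive Feature Elimination: repeatedly retrain and drop
    the `step` features with the smallest squared weights until
    only `n_keep` features remain."""
    active = list(range(n_features))
    while len(active) > n_keep:
        weights = train(active)  # one weight per feature in `active`
        # positions in `active`, ranked by squared weight, smallest first
        ranked = sorted(range(len(active)), key=lambda i: weights[i] ** 2)
        k = min(step, len(active) - n_keep)
        for i in sorted(ranked[:k], reverse=True):
            del active[i]
    return active
```

With a dummy trainer whose weight for each feature equals the feature's own index, the low-index features are eliminated first, which makes the ranking behaviour easy to check.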

If you have a lot of features based on relatively few resources, you could also systematically try out all combinations of resources to see the impact of each individual resource. But as there are 2^n-1 non-empty combinations for n resources, this quickly becomes intractable as the number of feature groups grows.
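Enumerating those combinations needs nothing beyond the Python standard library; the resource names below are just illustrative labels for the four resources mentioned in your mail:

```python
from itertools import combinations

def resource_subsets(resources):
    """Yield every non-empty combination of resources."""
    for r in range(1, len(resources) + 1):
        yield from combinations(resources, r)

# 2**4 - 1 = 15 subsets to train and evaluate
subsets = list(resource_subsets(["WN", "FN", "VN", "CPA"]))
```

For each subset you would then train a model on only the features derived from those resources and compare held-out performance.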

Since Support Vector Machines are not scale-invariant, you should also do some feature scaling in your preprocessing (you are probably already doing this), e.g. standardizing (zero mean and unit variance) or rescaling (to [0,1]) the values for each feature. Otherwise the order of magnitude of the individual features distorts the feature weights. If your feature matrix is rather sparse, rescaling might work better than standardizing, because subtracting the mean turns zero entries into non-zero ones and thereby destroys sparsity.
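For illustration, both transformations for a single feature column can be written in a few lines of plain Python (a minimal sketch; a real pipeline would use a library scaler, and for sparse data one that avoids mean subtraction):

```python
def standardize(values):
    """Scale to zero mean and unit (population) variance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    std = var ** 0.5 or 1.0  # guard against constant features
    return [(v - mean) / std for v in values]

def rescale(values):
    """Map linearly to [0, 1]; zeros stay zero when the minimum
    is 0, which is what preserves sparsity."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # guard against constant features
    return [(v - lo) / span for v in values]
```

Each function is applied per feature (i.e. per column of the feature matrix), with the scaling parameters estimated on the training data only.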

Best regards, Thomas

On Thu, 20 Aug 2015 14:36:23 -0400, Ken Litkowski <ken at clres.com> wrote:

> Are there any methods in computational linguistics for interpreting
> the coefficients in SVM models? I have 120 nice preposition
> disambiguation models developed using the Tratz-Hovy parser, with an
> average of about 15,000 features for each preposition. I'd like to
> identify the significant features (hopefully lexicographically
> salient). One such method (implemented in Weka) is to square the
> coefficients and to use this as the basis for ranking the features
> (the source of this method being a classic study by Guyon et al.,
> 2002, in gene selection for cancer classification using support
> vector machines
> <http://link.springer.com/article/10.1023/A:1012487302797>). I'm
> extending these models (which make heavy use of WN) with other
> lexical resources, including FN, VN, and CPA. This will make the
> feature space even more hyperdimensional, so I'd like to pare them
> back in a principled way so I can see the potential contribution of
> these other resources.
> Thanks,
> Ken

--
FAU Erlangen-Nürnberg
Department Germanistik und Komparatistik
Professur für Korpuslinguistik
Bismarckstr. 6, 91054 Erlangen

Fon: +49 9131 85-25908; Fax: +49 9131 85-29251
http://www.linguistik.fau.de/~tsproisl/
