[Corpora-List] Coefficients in SVM models

Hady elsahar hadyelsahar at gmail.com
Fri Aug 21 09:54:28 CEST 2015

Dear Ken,

Recursive feature elimination is one method, of course, but if a single training run already takes a long time, it may be impractical to repeat it many times for feature elimination. Another method worth investigating is a support vector machine regularized with an L1 penalty. The L1 penalty leads to sparse weight vectors for the decision boundaries, driving the coefficients of most insignificant features to exactly zero, so you can eliminate most of the unnecessary features after a single training run.
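A minimal sketch of that idea, assuming scikit-learn; the dataset here is synthetic (only two genuinely informative features) and the C value is illustrative, not tuned:

```python
# L1-regularized linear SVM: the L1 penalty drives most coefficients of
# uninformative features to exactly zero, so one fit is enough to prune.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.RandomState(0)
X = rng.randn(200, 50)                     # 200 samples, 50 features
y = (X[:, 0] - X[:, 1] > 0).astype(int)    # only features 0 and 1 matter

# penalty="l1" requires dual=False with the default squared-hinge loss
clf = LinearSVC(penalty="l1", dual=False, C=0.1, max_iter=10000)
clf.fit(X, y)

kept = np.flatnonzero(clf.coef_[0])        # indices of surviving features
print("nonzero-weight features:", kept.tolist())
```

Features whose weight comes out exactly zero can then be dropped before any further experiments.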


On Fri, Aug 21, 2015 at 9:17 AM, Thomas Proisl <thomas.proisl at fau.de> wrote:

> Dear Ken,
> Recursive Feature Elimination (the method used by Guyon et al.), i.e.
> going through cycles of retraining the model and repeatedly removing the
> features with the lowest absolute or squared feature weights, is a good
> way of arriving at a small “optimal” set of features. Since the number
> of features you have is several orders of magnitude larger than the
> number of samples, you should test that feature set on some held-out
> data to rule out excessive overtraining.
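As a concrete sketch of that loop, scikit-learn's RFE wrapper automates the retrain-and-drop cycles around a linear SVM (the data, sizes, and target feature count below are all illustrative):

```python
# Recursive Feature Elimination: repeatedly refit the linear SVM and drop
# the lowest-ranked (squared-weight) features until the target size remains.
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.svm import LinearSVC

rng = np.random.RandomState(0)
X = rng.randn(100, 30)                      # 100 samples, 30 features
y = (X[:, 3] + X[:, 7] > 0).astype(int)     # signal lives in features 3 and 7

selector = RFE(LinearSVC(dual=False, max_iter=10000),
               n_features_to_select=5,      # size of the final feature set
               step=1)                      # remove one feature per cycle
selector.fit(X, y)

selected = np.flatnonzero(selector.support_)
print("selected feature indices:", selected.tolist())
```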
> If you have a lot of features based on relatively few resources, you
> could also systematically try out all combinations of resources to see
> the impact of any resource. But as there are 2^n-1 possible combinations
> for n resources, this is not tractable for a larger number of feature
> groups.
> Since Support Vector Machines are not scale-invariant, you should also
> do some feature scaling in your preprocessing (you are probably already
> doing this), e.g. standardizing (zero mean and unit variance) or
> rescaling (to [0,1]) the values for each feature. Otherwise the order of
> magnitude of the individual features distorts the feature weights. If
> your feature matrix is rather sparse, rescaling might work better than
> standardizing because it preserves sparsity.
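A small sketch of the sparsity point, assuming scikit-learn and SciPy. One caveat: scikit-learn's MinMaxScaler does not accept sparse input, so the sparse-safe rescaler there is MaxAbsScaler, which divides each feature by its maximum absolute value and thus keeps zeros at zero:

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.preprocessing import MaxAbsScaler

X = csr_matrix(np.array([[0.0, 2.0, 0.0],
                         [1.0, 0.0, 0.0],
                         [0.0, 4.0, 3.0]]))

scaled = MaxAbsScaler().fit_transform(X)   # per-column division, zeros untouched
print("nonzeros before:", X.nnz, "after:", scaled.nnz)
```

Standardization, by contrast, subtracts a (generally nonzero) mean from every entry and would turn the matrix dense.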
> Best regards,
> Thomas
> On Thu, 20 Aug 2015 14:36:23 -0400,
> Ken Litkowski <ken at clres.com> wrote:
> > Are there any methods in computational linguistics for interpreting
> > the coefficients in SVM models? I have 120 nice preposition
> > disambiguation models developed using the Tratz-Hovy parser, with an
> > average of about 15,000 features for each preposition. I'd like to
> > identify the significant features (hopefully lexicographically
> > salient). One such method (implemented in Weka) is to square the
> > coefficients and to use this as the basis for ranking the features
> > (the source of this method being a classic study by Guyon et al.,
> > 2002, in gene selection for cancer classification using support
> > vector machines
> > <http://link.springer.com/article/10.1023/A:1012487302797>). I'm
> > extending these models (which make heavy use of WN) with other
> > lexical resources, including FN, VN, and CPA. This will make the
> > feature space even higher-dimensional, so I'd like to pare the
> > features back in a principled way so I can see the potential
> > contribution of these other resources.
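The squared-coefficient ranking Ken describes is straightforward to reproduce outside Weka; here is a sketch with a hypothetical weight vector standing in for a trained SVM's coef_ array:

```python
import numpy as np

coef = np.array([0.02, -1.30, 0.45, -0.10, 0.90])  # hypothetical SVM weights
scores = coef ** 2                                  # Guyon et al.'s criterion
ranking = np.argsort(scores)[::-1]                  # most salient feature first
print(ranking.tolist())                             # -> [1, 4, 2, 3, 0]
```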
> >
> > Thanks,
> > Ken
> >
> --
> FAU Erlangen-Nürnberg
> Department Germanistik und Komparatistik
> Professur für Korpuslinguistik
> Bismarckstr. 6, 91054 Erlangen
> Fon: +49 9131 85-25908; Fax: +49 9131 85-29251
> http://www.linguistik.fau.de/~tsproisl/
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora

--
Hady El-Sahar
Research Assistant
Center of Informatics Sciences | Nile University <http://nileuniversity.edu.eg/>