[Corpora-List] Coefficients in SVM models

Behrang Q. Zadeh me at atmykitchen.info
Fri Aug 21 10:58:30 CEST 2015


Hi Ken,

Depending on the type of SVM kernel employed in your task, you can also exploit (alpha-stable) random projections to construct a vector space of lower dimensionality, thereby largely bypassing an explicit dimension-reduction step such as the one described in your email. Use a random-projection-based method, e.g., random indexing in l2-regularised spaces, to construct a low-dimensional vector space, and then use this space for training and classification (e.g., as suggested in [1]).
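
For illustration, a minimal scikit-learn sketch of the idea (X_train, y_train,
X_test, y_test and the choice of 500 components are placeholders, not
recommendations) might look like this:

    # Project the ~15,000-dimensional feature space onto a few hundred
    # random directions, then train a linear SVM in the projected space.
    from sklearn.random_projection import SparseRandomProjection
    from sklearn.svm import LinearSVC
    from sklearn.pipeline import make_pipeline

    clf = make_pipeline(
        SparseRandomProjection(n_components=500, random_state=0),
        LinearSVC(C=1.0),
    )
    clf.fit(X_train, y_train)          # X_train: document-by-feature matrix
    print(clf.score(X_test, y_test))   # accuracy in the projected space

The projection matrix is data-independent, so the same pipeline can be refit
for each preposition without re-deriving the projection.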

Regards,

Behrang

[1] Sahlgren, M. and Cöster, R. (2004). Using bag-of-concepts to improve the performance of support vector machines in text categorization. In Proceedings of COLING 2004. URL: https://aclweb.org/anthology/C/C04/C04-1070.pdf

On Fri, Aug 21, 2015 at 10:23 AM, Maximilian Haeussler <max at soe.ucsc.edu> wrote:


> Hi Ken,
> random thought: in an environment like Weka, R or sklearn, you can change
> your classifier to a regression- or decision-tree-based classifier by
> changing just a single line in your code. The weights of the regression and
> the decision tree are easy to interpret.
> You could use the regression to analyse the influence of the features,
> while still doing the final classification with the SVM (in case the
> SVM really is far superior to the regression).
> You could use a lasso or elastic-net regressor and increase alpha to remove
> features, just like the L1 parameter for SVMs suggested by Hady.
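>
> For instance, a rough sklearn sketch (X, y and the alpha value are just
> placeholders; a larger alpha zeroes out more coefficients):
>
>     # Lasso drives uninformative coefficients to exactly zero, so the
>     # surviving features are the ones worth inspecting.
>     import numpy as np
>     from sklearn.linear_model import Lasso   # or ElasticNet
>
>     reg = Lasso(alpha=0.01).fit(X, y)
>     kept = np.flatnonzero(reg.coef_)          # indices of surviving features
>     print(len(kept), "features survive at alpha = 0.01")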
>
> cheers
> Max
>
> On Thu, Aug 20, 2015 at 8:36 PM, Ken Litkowski <ken at clres.com> wrote:
>
>> Are there any methods in computational linguistics for interpreting the
>> coefficients in SVM models? I have 120 nice preposition disambiguation
>> models developed using the Tratz-Hovy parser, with an average of about
>> 15,000 features for each preposition. I'd like to identify the significant
>> features (hopefully lexicographically salient). One such method
>> (implemented in Weka) is to square the coefficients and use this as the
>> basis for ranking the features (the source of this method being the classic
>> study by Guyon et al., 2002, on gene selection for cancer classification
>> using support vector machines
>> <http://link.springer.com/article/10.1023/A:1012487302797>). I'm
>> extending these models (which make heavy use of WN) with other lexical
>> resources, including FN, VN, and CPA. This will make the feature space even
>> more hyperdimensional, so I'd like to pare it back in a principled way so
>> I can see the potential contribution of these other resources.
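>>
>> (In scikit-learn terms, a rough sketch of that squared-coefficient ranking,
>> with X, y and feature_names as placeholders, would be something like
>>
>>     import numpy as np
>>     from sklearn.svm import LinearSVC
>>
>>     svm = LinearSVC(C=1.0).fit(X, y)        # X: n_samples x ~15,000 features
>>     scores = (svm.coef_ ** 2).sum(axis=0)   # squared weights, summed over classes
>>     ranking = np.argsort(scores)[::-1]      # most important features first
>>     top50 = [feature_names[i] for i in ranking[:50]]
>>
>> the idea being that features with the largest squared weights contribute most
>> to the decision function.)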
>>
>> Thanks,
>> Ken
>>
>> --
>> Ken Litkowski TEL.: 301-482-0237
>> CL Research EMAIL: ken at clres.com
>> 9208 Gue Road Home Page: http://www.clres.com
>> Damascus, MD 20872-1025 USA Blog: http://www.clres.com/blog
>>
>>