[Corpora-List] Imbalanced Text Classification

Koos Wilt kooswilt at gmail.com
Fri Jan 3 20:33:18 CET 2020


For the size imbalance, feature reduction should work well. In the paper linked below, I discuss a method of picking features on the basis of grammatical relations (SUBJECT, VERB, OBJECT) rather than choosing them by statistics. It works very well. If you carry this out, please let me know, and thanks in advance.

https://www.academia.edu/27207951/Linguistics_improves_statistical_classification_with_KLD_NB_TF_IDF_K-NN_the_positive_effects_of_reducing_feature_dimensionality_or_grammatical_feature_selection._Koos_van_der_Wilt?fbclid=IwAR1nKkVRLnzzioBesfxSHbnUyF1preF55fyJCT2kzcMD9kAEx7td_oE8AEA
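
A minimal sketch of the idea of selecting features by grammatical relation, not the exact procedure from the paper. It assumes the documents have already been run through a dependency parser and are available as (word, dependency label) pairs; the input format, the label names (nsubj, root, obj) and the function name grammatical_features are hypothetical choices made for illustration.

```python
from collections import Counter

# Dependency labels kept as features. The names follow a Universal
# Dependencies style convention, which is an assumption of this sketch.
KEPT_RELATIONS = {"nsubj", "root", "obj"}


def grammatical_features(parsed_doc):
    """Count lowercased words that act as subject, main verb, or object.

    parsed_doc: list of (word, dependency_label) pairs produced by any
    dependency parser (hypothetical input format).
    """
    return Counter(
        word.lower()
        for word, relation in parsed_doc
        if relation.lower() in KEPT_RELATIONS
    )


# Hand-annotated toy sentence: "The committee rejected the proposal yesterday"
doc = [
    ("The", "det"), ("committee", "nsubj"), ("rejected", "root"),
    ("the", "det"), ("proposal", "obj"), ("yesterday", "advmod"),
]
features = grammatical_features(doc)
print(sorted(features))  # words kept as features for this document
```

The resulting counts can be fed to any bag-of-words classifier in place of the full vocabulary, so the feature space shrinks without a statistical selection step.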

On Fri, 3 Jan 2020 at 20:28, Eugenio Martínez Cámara <emcamara at decsai.ugr.es> wrote:


> Hi Koos,
>
> I mean data that is imbalanced in both the training and test sets, in terms of the labels or sizes.
>
> Thanks.
>
> Kind regards,
> Eugenio.
>
> On 2020-01-03 20:14, Koos Wilt wrote:
>
> Best wishes for 2020 to you as well. By imbalanced data, do you mean
> training and test sets on different topics or of different sizes?
>
> -Koos
>
> On Fri, 3 Jan 2020 at 14:19, Eugenio Martínez Cámara <
> emcamara at decsai.ugr.es> wrote:
>
>> Dear folks,
>>
>> First, let me send you my best wishes for 2020.
>>
>> I'm now working on a problem with imbalanced data, and I'm wondering if
>> there is any good paper or survey about imbalanced classification in NLP
>> that goes beyond oversampling and undersampling.
>>
>> Do you know of any good paper? If so, please send me the link.
>>
>> Thank you very much in advance.
>>
>> Kind regards,
>> Eugenio.
>>
>> ---
>> Eugenio Martínez Cámara
>> Investigador posdoctoral en Tec. del Lenguaje Humano / Postdoctoral Researcher in Natural Language Proc.
>> Grupo de investigación SCI2S <http://sci2s.ugr.es/> / Research group SCI2S <http://sci2s.ugr.es/>
>> Dpto. Ciencias de la Computación e Inteligencia Artificial / Computer Science and Artificial Intelligence department
>> Universidad de Granada
>>