[Corpora-List] Framework for EmoText

Alexander Osherenko osherenko at gmx.de
Sat Jan 14 10:30:07 CET 2012


Dear Alexandre,

thank you for your questions.

First, about the framework. Evidently, I did not explain the purpose of the InfoFramework clearly. I implemented the framework to experiment with and to build software prototypes in opinion mining. However, the framework is NOT limited to opinion mining or lexical processing. It is intended for statistical processing in general, in any domain, for example acoustic or neurobiological processing. Moreover, the outcomes do not have to be emotional or sentiment-based. The framework (hence the name) provides a basis for experimentation and rapid prototyping and doesn't limit you to particular classification algorithms. Since InfoFramework is based on WEKA, you can use all available WEKA algorithms, for example NaiveBayes, SVM, or many others.

Now your questions.

> which predicts a sentiment label (i.e., positive, negative or neutral)
> instead of an emotion. The approach that is currently available, though,
> is based on a dictionary of affect instead of being automatically learnt.
> I once compared EmoLib to a tool similar to yours:
>
> http://atrilla.net/index.php?article=blog&specific=35
>
> and I qualitatively found that Machine Learning approaches tend
> to perform better at the expense of having a poorer generalisation ability
> (domain-specific limitation according to the topic/s of the training text).
> Or put in a different way, affective-dictionary-based approaches perform
> more modestly but are more generalisable (thus avoiding domain transfer
> problems?). Do you have a similar feeling with respect to this?
>
I have exactly the same feeling, and also much evidence. For instance, we published "Lexical Affect Sensing: Are Affect Dictionaries Necessary to Analyze Affect?" (http://edu.cs.uni-magdeburg.de/EC/lehre/sommersemester-2010/emotional-computing/informationen-zum-seminar/blog/annotation/AreAffectDictionariesNecssaryFulltext.pdf). In my PhD, I already relied on these findings and used, for example, stylometric, grammatical or deictic features. I also used Whissell's DAL as a source of lexical features to prove my hypothesis: the results are much poorer than those of opinion mining using the BNC frequency list. QED


> At least this is my mind and that's why I have not delivered a service
> with any of the learnt methods that EmoLib also implements (basically the
> Multinomial Naive Bayes, the Vector Space Model, LSA, Multinomial Logistic
> Regression and SVM). Which technique does EmoText use?
>
In my PhD, I tried to answer such core data-mining questions as the choice of classifier, feature evaluation and so on. I compared the results of NaiveBayes, SVM, and InformationGain with SVM. In my opinion, the choice of classifier is not important. More important are the choice of features and the explanation of the obtained results.
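
As an illustration of the feature-evaluation step mentioned above, here is a minimal sketch of information-gain scoring for word-presence features. It is not the InfoFramework or WEKA code; the toy documents, labels and function names are assumptions made up for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, word):
    """Entropy reduction obtained by splitting documents on presence of `word`."""
    with_word = [y for d, y in zip(docs, labels) if word in d]
    without_word = [y for d, y in zip(docs, labels) if word not in d]
    n = len(labels)
    remainder = (len(with_word) / n) * entropy(with_word) \
        + (len(without_word) / n) * entropy(without_word)
    return entropy(labels) - remainder


def rank_features(docs, labels):
    """Return (word, gain) pairs, highest gain first, ties in alphabetical order."""
    vocabulary = set().union(*docs)
    scores = {w: information_gain(docs, labels, w) for w in vocabulary}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy data: each document is a set of words.
DOCS = [
    {"great", "movie"},
    {"great", "plot"},
    {"boring", "movie"},
    {"boring", "plot"},
]
LABELS = ["positive", "positive", "negative", "negative"]

ranking = rank_features(DOCS, LABELS)
for word, gain in ranking:
    print(f"{word}: {gain:.3f}")
```

On this toy data the words "great" and "boring" separate the two classes perfectly, while "movie" and "plot" carry no information about the label. A feature-selection step would keep the top-ranked words and pass only those to the classifier.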


> Moreover, EmoLib first splits the sentences of the input text, then
> predicts the sentence-wise sentiment labels independently, and finally
> draws the affective wash at paragraph level. Do you follow a similar
> approach in EmoText?
>
In my PhD, I considered splitting texts into sentences and provided results for what you could call a hybrid approach, which combines the semantic and the statistical approach to opinion mining (first classifying sentences, then classifying longer texts such as paragraphs). However, the results are worse than classification using only lexical features.
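
The sentence-wise pipeline discussed above (split the text, label each sentence, then combine the labels for the whole text) can be sketched roughly as follows. This is neither EmoLib nor EmoText code: the tiny lexicon, the naive sentence splitter and the voting rule are all assumptions chosen to keep the example short, and the lexicon lookup merely stands in for whatever sentence classifier is actually used.

```python
import re
from collections import Counter

# Hypothetical toy lexicon; a real system would use a trained classifier
# or a full affect dictionary here.
TOY_LEXICON = {
    "good": 1, "great": 1, "love": 1,
    "bad": -1, "boring": -1, "hate": -1,
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def classify_sentence(sentence):
    """Label one sentence by summing the lexicon scores of its words."""
    words = re.findall(r"[a-z']+", sentence.lower())
    score = sum(TOY_LEXICON.get(w, 0) for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def classify_text(text):
    """Label each sentence, then compare positive and negative counts.

    Neutral sentences do not vote; a tie (or no sentences) gives 'neutral'.
    """
    labels = [classify_sentence(s) for s in split_sentences(text)]
    counts = Counter(labels)
    if counts["positive"] > counts["negative"]:
        overall = "positive"
    elif counts["negative"] > counts["positive"]:
        overall = "negative"
    else:
        overall = "neutral"
    return labels, overall


labels, overall = classify_text(
    "I love this phone. The battery is bad. The screen is great!"
)
print(labels, overall)
```

Swapping `classify_sentence` for a statistical classifier would leave the aggregation step unchanged, which makes it easy to compare this two-stage setup against classifying the whole text directly.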

Best
Alexander

> Thank you, indeed.
>
> Alex
>
>
> > Dear all,
> >
> > I put a brief description of my framework on the Internet that I
> > implemented in the context of the statistical EmoText (
> > www.socioware.de/technology.html#framework). You might notice some
> > resemblance with the WEKA Experimenter. Since I didn't have any brilliant
> > idea on how to name this framework, I called it simply InfoFramework.
> >
> > Although I tested the framework in the context of opinion mining, I assume
> > it can be used for any kind of statistical processing.
> >
> > Best
> > Alexander
> > _______________________________________________
> > UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> > Corpora mailing list
> > Corpora at uib.no
> > http://mailman.uib.no/listinfo/corpora
> >
>
>
> --
> _________________________________________________
>
> ALEXANDRE TRILLA
> B.Sc., M.Sc. in Electronics, Telecommunications
> Engineering and Information Technology
>
> Email: alex at atrilla.net
> Homepage: http://atrilla.net
> _________________________________________________
>
>
>