[Corpora-List] Release of EmoWordNet and ArSEL: Emotion Lexicons for English and Arabic

Gilbert Badaro (Student) ggb05 at mail.aub.edu
Tue Oct 9 11:02:02 CEST 2018


Dear Colleagues,

We, at the OMA project, are pleased to announce the release of two emotion and sentiment lexicons:

* EmoWordNet, an automatically developed emotion lexicon for English

* ArSEL, a large-scale Arabic Sentiment and Emotion Lexicon.

The lexicons are publicly available at http://www.oma-project.com. We hope you find them useful, and we would appreciate your feedback. Please email ggb05 at aub.edu.lb.

Brief summary

EmoWordNet [1] includes 67,738 English WordNet (EWN) terms, each annotated with 8 emotion scores: afraid, amused, angry, annoyed, don't care, happy, inspired and sad. The scores were derived automatically from DepecheMood [2] by aligning DepecheMood terms to EWN terms and then using the synonymy relation in EWN to propagate emotion scores to synsets and to additional terms not present in DepecheMood. The textual format of EmoWordNet is as follows:

EWN_Term#POS_tag;AFRAID;AMUSED;ANGRY;ANNOYED;DONT_CARE;HAPPY;INSPIRED;SAD
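A line in this format could be read along the following lines. This is only a minimal sketch: it assumes one entry per line, that the scores are plain decimal numbers, and that the term and POS tag are separated by the last '#' in the first field. The function names and the sample line are made up for illustration and are not taken from the lexicon or from any released tooling.

```python
from typing import Dict, NamedTuple

# Emotion columns, in the order given by the format line above.
EMOTIONS = ("AFRAID", "AMUSED", "ANGRY", "ANNOYED",
            "DONT_CARE", "HAPPY", "INSPIRED", "SAD")


class EmoEntry(NamedTuple):
    term: str
    pos: str
    scores: Dict[str, float]


def parse_emowordnet_line(line: str) -> EmoEntry:
    """Parse one 'EWN_Term#POS_tag;score;...;score' line (assumed layout)."""
    fields = line.strip().split(";")
    if len(fields) != 1 + len(EMOTIONS):
        raise ValueError(f"expected {1 + len(EMOTIONS)} fields, got {len(fields)}")
    # Assumption: the POS tag follows the last '#' in the first field.
    term, sep, pos = fields[0].rpartition("#")
    if not sep:
        raise ValueError("first field has no '#' separating term and POS tag")
    scores = {name: float(value) for name, value in zip(EMOTIONS, fields[1:])}
    return EmoEntry(term, pos, scores)


def dominant_emotion(entry: EmoEntry) -> str:
    """Return the name of the emotion with the highest score."""
    return max(entry.scores, key=entry.scores.get)


if __name__ == "__main__":
    # Made-up sample line; the values are not taken from EmoWordNet.
    sample = "joy#n;0.05;0.10;0.05;0.05;0.05;0.50;0.15;0.05"
    entry = parse_emowordnet_line(sample)
    print(entry.term, entry.pos, dominant_emotion(entry))
```

Keeping the scores in a dictionary keyed by emotion name avoids relying on column positions in downstream code.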

ArSEL [3]: Building on our previously published Arabic sentiment lexicon, ArSenL [4], which is linked to EWN synsets, we augmented ArSenL lemmas with emotion scores retrieved from EmoWordNet. ArSEL includes 149,634 entries corresponding to 32,196 unique lemma/POS pairs. Each entry carries the three sentiment scores from ArSenL (positive, negative and objective, which sum to 1) plus 8 emotion scores (afraid, amused, angry, annoyed, don't care, happy, inspired and sad).

The lemma form in ArSEL, denoted by AraMorph_Lemma, follows the one adopted by LDC for easier integration in NLP tasks.

Each entry in the lexicon has the following format:

AWN_OFFSET;EWN_OFFSET;POS_Tag;AWN_Lemma;AraMorph_Lemma;Pos_Sentiment_score;Neg_Sentiment_Score;Confidence###AFRAID;AMUSED;ANGRY;ANNOYED;DONT_CARE;HAPPY;INSPIRED;SAD
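An entry in this format could be read roughly as follows. This is a sketch under stated assumptions: one entry per line, '###' separating the sentiment part from the emotion part, numeric scores written as plain decimals, and no ';' inside lemmas. Since the format line lists only positive and negative scores while the three sentiment scores sum to 1, the objective score is computed here as 1 - positive - negative. The names and the sample line are hypothetical and not taken from the lexicon.

```python
from typing import Dict, NamedTuple

# Emotion columns, in the order given by the format line above.
EMOTIONS = ("AFRAID", "AMUSED", "ANGRY", "ANNOYED",
            "DONT_CARE", "HAPPY", "INSPIRED", "SAD")


class ArselEntry(NamedTuple):
    awn_offset: str
    ewn_offset: str
    pos_tag: str
    awn_lemma: str
    aramorph_lemma: str
    positive: float
    negative: float
    objective: float
    confidence: float
    emotions: Dict[str, float]


def parse_arsel_line(line: str) -> ArselEntry:
    """Parse one ArSEL entry (assumed layout: 8 fields, '###', 8 scores)."""
    sentiment_part, sep, emotion_part = line.strip().partition("###")
    if not sep:
        raise ValueError("missing '###' between sentiment and emotion fields")
    head = sentiment_part.split(";")
    emo = emotion_part.split(";")
    if len(head) != 8 or len(emo) != len(EMOTIONS):
        raise ValueError("unexpected number of fields")
    positive, negative, confidence = (float(x) for x in head[5:8])
    # Assumption: objective is the remainder, since the three scores sum to 1.
    objective = 1.0 - positive - negative
    emotions = {name: float(value) for name, value in zip(EMOTIONS, emo)}
    # Offsets are kept as strings so any leading zeros are preserved.
    return ArselEntry(head[0], head[1], head[2], head[3], head[4],
                      positive, negative, objective, confidence, emotions)


if __name__ == "__main__":
    # Made-up sample line; identifiers and values are not from ArSEL.
    sample = "123;456;n;lemma_a;lemma_b;0.25;0.5;1.0###0.1;0.1;0.1;0.1;0.1;0.3;0.1;0.1"
    entry = parse_arsel_line(sample)
    print(entry.aramorph_lemma, entry.objective, entry.emotions["HAPPY"])
```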

More details:

[1] Gilbert Badaro, Hussein Jundi, Hazem Hajj, and Wassim El-Hajj. "EmoWordNet: Automatic Expansion of Emotion Lexicon Using English WordNet." In Proceedings of the Seventh Joint Conference on Lexical and Computational Semantics, pp. 86-93. 2018. Please use this full citation when referring to our work in your research publications.

[2] Jacopo Staiano and Marco Guerini. "DepecheMood: A Lexicon for Emotion Analysis from Crowd-Annotated News." arXiv preprint arXiv:1405.1605. 2014.

[3] Gilbert Badaro, Hussein Jundi, Hazem Hajj, Wassim El-Hajj, and Nizar Habash. "ArSEL: A Large Scale Arabic Sentiment and Emotion Lexicon." In OSACT 2018, co-located with LREC. 2018. Please use this full citation when referring to our work in your research publications.

[4] Gilbert Badaro, Ramy Baly, Hazem Hajj, Nizar Habash, and Wassim El-Hajj. "A large scale Arabic sentiment lexicon for Arabic opinion mining." In Proceedings of the EMNLP 2014 Workshop on Arabic Natural Language Processing (ANLP), pp. 165-173. 2014.

Best Regards,

Gilbert Badaro

Email: ggb05 at aub.edu.lb

PhD-Candidate in Electrical & Computer Engineering

Opinion Mining in Arabic Project

Data Mining Group

American University of Beirut
