Submissions are invited for MASALA (Machine-learning Approaches to Sentiment Analysis and Learning Algorithms), an ICML14 workshop exploring the new frontiers of big data computing for opinion mining through machine-learning techniques and sentiment learning methods. For more information, please visit: http://sentic.net/masala
RATIONALE
The distillation of knowledge from social media is an extremely difficult task because the content of today's Web, while perfectly suitable for human consumption, remains hardly accessible to machines. The opportunity to capture the opinions of the general public about social events, political movements, company strategies, marketing campaigns, and product preferences has raised growing interest both within the scientific community, where it has led to many exciting open challenges, and in the business world, where it promises remarkable benefits for marketing and financial market prediction.
Statistical NLP has been the mainstream NLP research direction since the late 1990s. It relies on language models based on popular machine-learning algorithms such as maximum likelihood, expectation maximization, conditional random fields, and support vector machines. By feeding a large training corpus of annotated texts to a machine-learning algorithm, the system can not only learn the valence of keywords but also take into account the valence of other arbitrary keywords, punctuation, and word co-occurrence frequencies. However, standard statistical methods are generally semantically weak when they focus merely on lexical co-occurrence elements that individually have little predictive value.
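As a rough illustration of the statistical approach described above, the sketch below trains a multinomial naive Bayes classifier with add-one smoothing on a tiny annotated corpus. Naive Bayes is used here only because it is compact; it is not one of the algorithms named above, and the corpus, labels, and function names (`tokenize`, `train`, `predict`) are all invented for this example.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Lowercase and split on whitespace; "!" is kept as its own token
    # so that punctuation can act as a feature alongside keywords.
    return text.lower().replace("!", " ! ").split()

def train(corpus):
    """Count tokens per label from a list of (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> token frequencies
    label_counts = Counter()            # label -> number of documents
    vocab = set()
    for text, label in corpus:
        tokens = tokenize(text)
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, label_counts, vocab

def predict(model, text):
    """Return the label with the highest smoothed log-probability."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy corpus, for illustration only.
TOY_CORPUS = [
    ("great phone , love it !", "pos"),
    ("love the screen", "pos"),
    ("terrible battery , hate it", "neg"),
    ("hate the slow screen", "neg"),
]
model = train(TOY_CORPUS)
print(predict(model, "love it !"))
print(predict(model, "hate the battery"))
```

The classifier treats every token independently, which is exactly the limitation noted above: individual lexical features carry little semantic information on their own.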
Endogenous NLP, instead, involves the use of machine-learning techniques to perform semantic analysis of a corpus by building structures that approximate concepts from a large set of documents. It does not involve prior semantic understanding of documents; instead, it relies only on the endogenous knowledge of the documents themselves (rather than on external knowledge bases). The advantages of this approach over the knowledge engineering approach are effectiveness, considerable savings in terms of expert manpower, and straightforward portability to different domains. Endogenous NLP includes methods based either on lexical semantics, which focuses on the meanings of individual words (e.g., LSA, LDA, and MapReduce), or on compositional semantics, which looks at the meanings of sentences and longer utterances (e.g., HMM, association rule learning, and probabilistic generative models).
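The idea of approximating concepts from the corpus alone can be sketched with plain term-document vectors and cosine similarity. This is a deliberately simplified stand-in and not LSA itself, which would additionally apply a truncated singular value decomposition to the term-document matrix; the documents and the helper names (`term_document_vectors`, `cosine`) are assumptions made for this example.

```python
import math
from collections import defaultdict

def term_document_vectors(documents):
    """Map each word to a sparse {doc_index: count} vector built only from the corpus."""
    vectors = defaultdict(lambda: defaultdict(int))
    for i, doc in enumerate(documents):
        for word in doc.lower().split():
            vectors[word][i] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical toy documents, for illustration only.
DOCS = [
    "battery drains fast",
    "battery charger broke",
    "charger and battery replaced",
    "screen is bright",
    "bright vivid screen",
]
vectors = term_document_vectors(DOCS)

# Words that occur in the same documents end up with similar vectors,
# without consulting any external knowledge base.
print(cosine(vectors["battery"], vectors["charger"]))
print(cosine(vectors["battery"], vectors["screen"]))
```

In this toy corpus, "battery" and "charger" score higher than "battery" and "screen" because they share documents, which is the sense in which the structure approximates a concept.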
TOPICS
MASALA aims to provide an international forum for researchers in the field of machine learning for opinion mining and sentiment analysis to share information on their latest investigations in social information retrieval and their applications in both academic research areas and industrial sectors. The broader context of the workshop encompasses opinion mining, social media marketing, information retrieval, and natural language processing. Topics of interest include but are not limited to:
• Endogenous NLP for sentiment analysis
• Sentiment learning algorithms
• Big social data analysis
• Opinion retrieval, extraction, classification, tracking and summarization
• Domain-specific sentiment analysis and model adaptation
• Emotion detection
• Sentiment pattern mining
• Concept-level sentiment analysis
• Biologically-inspired opinion mining
• Social-network motivated methods for natural language processing
• Topic modeling for aspect-based sentiment analysis
• Learning to rank for social media
• Content-based and social-based recommendation
• Multimodal sentiment analysis
• Content-, concept-, and context-based sentiment analysis
TIMEFRAME
• April 20th, 2014: Submission deadline
• May 11th, 2014: Notification of acceptance
• May 18th, 2014: Final manuscripts due
• June 25th, 2014: Workshop date
ORGANIZERS
• Yunqing Xia, Tsinghua University (China)
• Erik Cambria, National University of Singapore (Singapore)
• Newton Howard, MIT Media Laboratory (USA)