CALL FOR PAPERS
The huge volume of information available on the Web is continuously growing, and there is great interest in analyzing this information in order to fulfil specific user needs. The main challenge researchers face when analyzing the content of Web pages is that the pages are quite often written in natural language, and very often without any helpful structure. In other words, it is a problem of processing almost pure raw data, often just short texts, which makes the task quite challenging. In fact, short texts typically contain a small number of words whose absolute frequency is relatively low in comparison with their frequency in long documents. This makes tasks such as text categorization harder.
The exponential growth in the number of Web documents makes the need to analyze short texts evident. For instance, digital libraries and Web-based repositories of scientific and technical information often provide free access only to abstracts, not to the full texts of the documents. News items, document titles, snippets, FAQs, chats, and abstracts are just some examples of the high volume of short texts available on the Web.
With the so-called Web 2.0, the largest communication and collaboration platform, new short texts are created on a daily basis: on-line evaluations of commercial products, blog posts, and comments in social networks. Twitter, for instance, is a successful social network technology of the Web 2.0 genre, used by millions of people and thousands of companies to publish very short messages sharing experiences and/or opinions about a product or service. Given the huge amount of information available in social media, there is a clear need to mine these messages for useful information in order to discover knowledge about the collective thinking of the crowds. Tweet analysis is considered potentially very important because comments, opinions, suggestions and complaints can be used to define new marketing strategies or to obtain information on companies' reputation.
In recent years there has been considerable interest from the computational linguistics community in the efficient analysis of short texts. In fact, several relevant tracks have been organized at the major evaluation campaigns: TREC (blog and Web tracks), CLEF (Web people search laboratory), NTCIR (opinion analysis pilot task), INEX (ad-hoc passage retrieval task), ROMIP (track on news clustering), and FIRE (ad-hoc task on retrieval from technical forums and mailing lists).
This special issue aims to collect state-of-the-art contributions on the development and use of techniques for the analysis of short texts on the Web, with special emphasis on resources of the collaborative platforms of the Web 2.0. We therefore welcome contributions that include, but are not limited to, resources of short texts such as blog posts, tweets, text messages, etc., as well as innovative techniques using linguistic resources for improved understanding of mono- or multilingual short texts.
TOPICS OF INTEREST
We are particularly interested in articles showing the benefits of using such resources and techniques, including, but not limited to, the following topics:
* Categorization of short texts
* Cross-lingual short text mining on the Web
* Analysis of weblogs, tweets, text messages and snippets
* Knowledge discovery from Web 2.0
* Opinion mining in social media
* Enterprise 2.0 and market analysis
* Automatic generation of collaborative linguistic resources
* Evaluation of techniques and short text resources
IMPORTANT DATES
* Submission deadline (abstract): March 15, 2011
* Submission deadline (full paper): March 31, 2011
* First-round reviews due: May 31, 2011
* Revised versions due: July 15, 2011
* Second-round reviews due: September 15, 2011
* Final versions due: October 31, 2011
* Special issue publication: sometime in 2012
PROGRAM COMMITTEE
Eneko Agirre, University of the Basque Country
Mikhail Alexandrov, Autonomous University of Barcelona
Enrique Alfonseca, Google Zurich
Yassine Benajiba, Philips Research North America
Andrew Borthwick, Intelius
Pavel Braslavski, Yandex
Paul Clough, University of Sheffield
José Carlos Cortizo, BrainSins
Alexander Gelbukh, National Polytechnic Institute
Alfio Massimiliano Gliozzo, IBM Watson
Julio Gonzalo, UNED
Chu-Ren Huang, The Hong Kong Polytechnic University
Hitoshi Isahara, Toyohashi University of Technology
Jaap Kamps, University of Amsterdam
Pavel Makagonov, Mixtec Technological University
Prasenjit Majumder, DAIICT Gandhinagar
Antonia Martí, University of Barcelona
Patricio Martínez, University of Alicante
Rada Mihalcea, University of North Texas
Mandar Mitra, Indian Statistical Institute
Manuel Montes y Gómez, INAOE Puebla
Roberto Navigli, University of Rome La Sapienza
Boris Novikov, St. Petersburg University
Ted Pedersen, University of Minnesota
Marco Pennacchiotti, Yahoo! Labs Santa Clara
Efstathios Stamatatos, University of the Aegean
Benno Stein, Bauhaus-Universität Weimar
José Antonio Troyano, University of Seville
Dan Tufiș, Romanian Academy
Jan Wiebe, University of Pittsburgh
Xiaofang Zhou, University of Queensland
Xiaoyan Zhu, Tsinghua University Beijing
GUEST EDITORS
Paolo Rosso, Universidad Politécnica de Valencia, Spain
Marcelo Errecalde, Universidad Nacional de San Luis, Argentina
David Pinto, Benemérita Universidad Autónoma de Puebla, Mexico
SUBMISSION
Please follow the submission instructions available from the LRE webpage at http://chum.edmgr.com/
For the submission of the abstract and additional information, please contact David Pinto (dpinto at cs.buap.mx)
Paolo Rosso
Head of Natural Language Engineering Lab.
Dpto. Sistemas Informáticos y Computación
Universidad Politécnica de Valencia, Spain
URL: http://www.dsic.upv.es/~prosso
email: prosso [at] dsic.upv.es
fax: +34 963877359
tel: +34 963877007 ext. 73571