[Corpora-List] CfP: NLP4CMC 2015: 2nd Workshop on NLP for Computer-Mediated Communication / Social Media @ GSCL-2015

Zesch, Torsten torsten.zesch at uni-due.de
Thu Jun 25 10:01:00 CEST 2015

=== 2nd CALL FOR PAPERS (Extended Deadline: 31 July 2015) ===

NLP 4 CMC 2015: 2nd Workshop on Natural Language Processing for Computer-Mediated Communication / Social Media


Pre-conference workshop at GSCL Conference 2015, Duisburg/Germany (September 29, 2015)


Over the past decade, there has been growing interest in collecting, processing and analyzing data from genres of social media and computer-mediated communication (CMC). As part of large corpora which have been automatically crawled from the WWW, CMC data are often regarded as an unloved "bycatch" which is difficult to handle with NLP tools that have been optimized for processing edited text; on the other hand, these data are important parts of web corpora for all research and application contexts which require data sets that represent the diversity of genres and linguistic variation on the web. For corpus-based variational linguistics, CMC corpora are an important resource for closing the "CMC gap" both in corpora of contemporary written language and in corpora of spoken language: since CMC and social media make up an important part of everyday communication, investigations into language change and linguistic variation need to be able to include CMC and social media data in their empirical analyses.

Nevertheless, the development of approaches and tools for processing the linguistic and structural peculiarities of CMC genres and for building CMC corpora is lagging behind the interest in dealing with these types of data in the fields of language technology, corpus-based linguistics and web mining.

The goal of this workshop is to provide a platform for the presentation of results and ongoing work in adapting NLP tools for processing CMC / social media data. The 1st NLP4CMC workshop was held in September 2014 at KONVENS. Proceedings of the workshop have been published as part of the KONVENS 2014 workshop proceedings (http://www.uni-hildesheim.de/konvens2014/data/konvens2014-workshop-proceedings.pdf).

The focus of the workshop is on German data, but submissions on NLP approaches, annotation experiments etc. for data in other European languages are also welcome as long as they make a significant contribution to the further development of the processing of CMC phenomena.


We encourage the submission of long and short research and demo papers on topics related to social media / CMC, including, but not restricted to, the following:

* Corpora and lexical semantic resources for the analysis of social media / computer-mediated communication
* Normalization (spelling correction, ...)
* Automatic preprocessing (tokenization, POS tagging, lemmatization, parsing, word sense disambiguation)
* Annotation of linguistic and structural features in social media / CMC data (annotation schemas, annotation experiments, ...)
* Domain adaptation
* Automatic methods in corpus-based CMC / social media analysis (sentiment, summarization, trend detection, ...)
* Big-data social media analysis


* Submissions due: July 31, 2015
* Notification: August 31, 2015
* Camera-ready papers (revised versions) due: September 22, 2015
* Workshop: September 29, 2015


Submissions should include the names and addresses of all authors and meet the following requirements:

* Length (2-4 pages)
* Authors should indicate the intended type (paper, demo, work in progress) and format of their contribution (talk / poster / demonstration)
* Submissions need to be made in English and should be in PDF format
* Style sheet for submissions: http://gscl2015.inf.uni-due.de/instructions-for-authors/

Submissions will be accepted via the START system: https://www.softconf.com/e/nlp4cmc2015/


Sabine Bartsch (TU Darmstadt)

Thomas Bartz (TU Dortmund)

Thierry Chanier (Université Blaise Pascal, Clermont-Ferrand)

Isabella Chiari (Università "La Sapienza", Rome)

Stefanie Dipper (Ruhr-Universität Bochum)

Stefan Evert (Universität Erlangen)

Iris Hendrickx (Radboud University Nijmegen)

Verena Henrich (Universität Tübingen)

Tobias Horsmann (Universität Duisburg-Essen)

Lothar Lemnitzer (BBAW, Berlin)

Anke Lüdeling (Humboldt-Universität Berlin)

Harald Lüngen (IDS, Mannheim)

Preslav Nakov (Qatar QCRI)

Günter Neumann (DFKI, Saarbrücken)

Nelleke Oostdijk (Radboud University Nijmegen)

Ines Rehbein (Universität Potsdam)

Roman Schneider (IDS, Mannheim)

Egon W. Stemle (EURAC, Bozen)

Angelika Storrer (Universität Mannheim)

Kay-Michael Würzner (Universität Potsdam)

---- ORGANIZERS ----

Michael Beißwenger (TU Dortmund University)

Torsten Zesch (Univ. of Duisburg-Essen)

The workshop is organized by the special interest group "Social Media / Computer-Mediated Communication" of the German Society for Computational Linguistics & Language Technology (GSCL) (http://gscl.org/ak-ibk.html).
