[Corpora-List] CALL FOR PARTICIPATION: EACL 2014 Tutorial in Natural Language Processing for Social Media

peter ljunglöf peter.ljunglof at heatherleaf.se
Wed Mar 5 11:33:38 CET 2014


EACL Tutorial in Natural Language Processing for Social Media

Gothenburg, Sweden, 26 April 2014


There is an increasing need to interpret and act upon information from large-volume social media streams, such as Twitter, Facebook, and forum posts. However, NLP methods face difficulties when processing social media text. We call for participation in an intermediate-to-advanced level tutorial discussing the state of the art in processing social media text.

Key points of the tutorial include:
- Characterisation of language in social media, and why it is difficult to process
- In-depth examination of multiple approaches to core NLP tasks on social media text
- Discussion of corpus collection and the use of crowdsourcing for annotation
- Practical, legal and ethical aspects of gathering and distributing social media data and metadata
- Current and future applications of social media information

The tutorial takes a detailed view of key NLP tasks (corpus annotation, linguistic pre-processing, information extraction and opinion mining) of social media content. After a short introduction to the challenges of processing social media, we will cover key NLP algorithms adapted to processing such content, discuss available evaluation datasets and outline remaining challenges.

The core of the tutorial will present NLP techniques tailored to social media, specifically: language identification, tokenisation, normalisation, part-of-speech tagging, named entity recognition, entity linking, event recognition, opinion mining, and text summarisation.

Since the lack of human-annotated NLP corpora of social media content is another major challenge, this tutorial will also cover crowdsourcing approaches used to collect training and evaluation data (including paid-for crowdsourcing with CrowdFlower, also combined with expert-sourcing and games with a purpose). We will also briefly discuss practical and ethical considerations arising from gathering and mining social media content.

The last part of the tutorial will address applications, including summarisation of social media content, user modelling (geo-location, age, gender, and personality identification), media monitoring and information visualisation (e.g. for detecting bushfires or predicting virus outbreaks), and using social media to predict economic and political outcomes (e.g. stock price movements, voting intentions).

Web address: http://eacl2014.org/tutorial-social-media

Registration is to be made online via the EACL main registration site: http://eacl2014.org/registration

This tutorial is supported by the CHIST-ERA project uComp (www.ucomp.eu) and also by the EU FP7 project Pheme (www.pheme.eu).

Hope to see you in Göteborg!

Leon Derczynski and Kalina Bontcheva
