[Corpora-List] 2016 IGGSA Shared Task on Extracting Sources and Targets from (German-language) Political Speeches

Josef Ruppenhofer ruppenho at uni-hildesheim.de
Wed Mar 23 18:28:56 CET 2016


Dear list members,

We are pleased to announce the 2nd iteration of the IGGSA (Interest Group on German Sentiment Analysis; https://sites.google.com/site/iggsahome/) Shared Task on German Sentiment Analysis. It is open to any interested participants from both academia and industry.

**INTRODUCTION**

Sentiment analysis and opinion mining attract strong attention both in the scientific community and in industry. Use cases include the analysis of products, brands, companies, persuasive language and more. However, most work is performed on English. In line with IGGSA's mission, our shared task focuses on German.

**TASK DESCRIPTION**

Sentiment Analysis on German speeches from the Swiss parliament

_Full task_ Identification of subjective expressions, their sources and targets.

_Subtask a_ Subjective expressions are given; the task is to identify the opinion sources.

_Subtask b_ Subjective expressions are given; the task is to identify the opinion targets.

More detailed information can be found at http://iggsasharedtask2016.github.io

**DATA SETS**

The training data consists of 605 sentences representing continuous segments of 25 speeches on 9 different topics given before the Swiss parliament.

This data was used as test data in the first iteration of this shared task in 2014. A new adjudicated version of that gold standard data is available for download at http://iggsasharedtask2016.github.io/data/shata16_training_adjudicated.xml

(The format of the gold standard is SALSA/TIGER-XML. It can be easily viewed and edited with the SALTO-tool: http://www.coli.uni-saarland.de/projects/salsa/page.php?id=software)
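For participants who prefer to read the gold standard programmatically rather than in SALTO, the following is a minimal Python sketch. It assumes the usual TIGER-XML layout (`t` terminals and `nt` nonterminals with `edge` children) plus a SALSA-style `sem` layer in which each `frame` has a `target` and `fe` elements pointing to syntax nodes via `fenode`. The embedded sample sentence, the frame name and the role names `Source` and `Target` are illustrative assumptions, not an excerpt from the shared task data; please check the element and attribute names against the actual file.

```python
import xml.etree.ElementTree as ET

# Hand-made sample in the general shape of SALSA/TIGER-XML.
# Illustrative only: NOT taken from the shared task data; the frame
# and role names below are assumptions.
SAMPLE = """<corpus><body>
 <s id="s1">
  <graph root="s1_501">
   <terminals>
    <t id="s1_1" word="Wir" pos="PPER"/>
    <t id="s1_2" word="loben" pos="VVFIN"/>
    <t id="s1_3" word="den" pos="ART"/>
    <t id="s1_4" word="Vorschlag" pos="NN"/>
   </terminals>
   <nonterminals>
    <nt id="s1_500" cat="NP">
     <edge label="NK" idref="s1_3"/>
     <edge label="NK" idref="s1_4"/>
    </nt>
    <nt id="s1_501" cat="S">
     <edge label="SB" idref="s1_1"/>
     <edge label="HD" idref="s1_2"/>
     <edge label="OA" idref="s1_500"/>
    </nt>
   </nonterminals>
  </graph>
  <sem><frames>
   <frame name="SubjectiveExpression" id="s1_f1">
    <target><fenode idref="s1_2"/></target>
    <fe name="Source" id="s1_f1_e1"><fenode idref="s1_1"/></fe>
    <fe name="Target" id="s1_f1_e2"><fenode idref="s1_500"/></fe>
   </frame>
  </frames></sem>
 </s>
</body></corpus>"""


def sentence_roles(s):
    """Return one dict per frame in sentence element `s`."""
    # Map terminal ids to word forms, and nonterminal ids to child ids.
    words = {t.get("id"): t.get("word") for t in s.iter("t")}
    children = {
        nt.get("id"): [e.get("idref") for e in nt.findall("edge")]
        for nt in s.iter("nt")
    }

    def leaves(node_id):
        # A terminal is its own leaf; a nonterminal expands recursively.
        if node_id in words:
            return [node_id]
        found = []
        for child in children.get(node_id, []):
            found.extend(leaves(child))
        return found

    def text(parent):
        # Join the words covered by all <fenode> references under `parent`.
        ids = []
        for fenode in parent.findall("fenode"):
            ids.extend(leaves(fenode.get("idref")))
        return " ".join(words[i] for i in ids)

    result = []
    for frame in s.iter("frame"):
        target = frame.find("target")
        entry = {"expression": text(target) if target is not None else ""}
        for fe in frame.findall("fe"):
            entry[fe.get("name")] = text(fe)
        result.append(entry)
    return result


root = ET.fromstring(SAMPLE)
roles = [r for s in root.iter("s") for r in sentence_roles(s)]
print(roles)
```

To run the same function over a downloaded file, replace `ET.fromstring(SAMPLE)` with `ET.parse(path).getroot()`, where `path` is wherever the XML was saved locally.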

A preprocessed version of the data is available as well: http://iggsasharedtask2016.github.io/data/shata16_training_adjudicated_preproc.tgz (Preprocessing includes part-of-speech tagging, lemmatization, constituency parsing, dependency parsing and named-entity recognition.)

The task will be evaluated on unseen test data of about equal size to the training data. The test data is drawn from the same source and concerns the same topics as the training data.

**SOFTWARE**

In order to encourage participation, we make freely available a lexicon-based system that participated in the previous run of the shared task in 2014. The system is also compliant with this year’s format specification.

More information regarding the system’s design can be found in: http://hildok.bsz-bw.de/files/296/04_02.pdf

The system can be downloaded by following this link: https://github.com/miwieg/german-opinion-role-extractor

**SCHEDULE**

Friday, July 29th, 2016: Registration deadline

Monday, August 1st, 2016: Release of evaluation data

Sunday, August 14th, 2016: Submission of system runs

Monday, August 29th, 2016: Participants are notified of results

Thursday, September 15th, 2016: Submission of working notes

Thursday, September 22nd, 2016: Workshop co-located with KONVENS 2016 in Bochum

Monday, October 31st, 2016: Submission of full working papers for online publication

**REGISTRATION**

July 29: Registration deadline

Please register by sending an informal email to iggsasharedtask2016 _AT_ gmail.com.

**WORKSHOP**

The workshop for the shared task will take place in Bochum, Germany, on September 22nd, 2016, one day after the end of the KONVENS main conference (http://www.linguistics.rub.de/konvens16/), with which the workshop is co-located.

**COORDINATORS**

Josef Ruppenhofer (Uni Hildesheim)

Julia Maria Struss (Uni Hildesheim)

Michael Wiegand (Uni Saarland)

**CONTACT AND MAILING LIST**

If you have any questions related to the shared task, you can contact the organizers via the email address iggsasharedtask2016 _AT_ gmail.com.

Participants are encouraged to also join our mailing list at: https://groups.google.com/forum/#!forum/iggsa-shared-task-2016.

**ACKNOWLEDGEMENTS**

We are happy to acknowledge the financial support that we received from GSCL (German Society of Computational Linguistics) for annotating the training data.


