[Corpora-List] COLING 2008 Information Retrieval for Question Answering (IR4QA) Workshop

Mark Greenwood mark at dcs.shef.ac.uk
Fri Feb 22 13:59:09 CET 2008

Call for Papers

COLING'08 Workshop

Information Retrieval for Question Answering (IR4QA)

August 2008, Manchester, UK

Open domain question answering (QA) has become a very active research area over the past decade, due in large measure to the stimulus of the TREC Question Answering track (now a track within the recently formed Text Analysis Conference, TAC). This track addresses the task of finding *answers* to natural language (NL) questions (e.g. "How tall is the Eiffel Tower?" "Who is Aaron Copland?" "What effect does second-hand smoke have on non-smokers?") from large text collections. This task stands in contrast to the more conventional IR task of retrieving *documents* relevant to a query, where the query may be simply a collection of keywords (e.g. "Eiffel Tower", "American composer, born Brooklyn NY 1900, ...").

Finding answers requires processing texts at a level of detail that cannot be carried out at retrieval time for very large text collections. This limitation has led many researchers to rely on, broadly, a two-stage approach to the QA task. In stage one, a subset of question-relevant texts is selected from the whole collection. In stage two, this subset is subjected to detailed processing for answer extraction. Clearly, performance at stage two is bounded by performance at stage one, and previous work has shown that standard IR ranking algorithms, despite their sophistication, are not well suited to the stage one task of retrieving relevant documents given short natural language questions. It is likely that improvements in this area will come from linguistic insights into why QA-focused IR is different from the traditional IR model.

With the continued expansion of QA research into more complex question types and with the speed with which answers are returned becoming an issue, the importance of having good, QA-focused IR techniques is likely to increase. To date this topic has received limited explicit attention despite its obvious importance. This 2nd IR4QA workshop aims to address this situation by continuing to attract the attention of researchers to the specific IR challenges raised by QA.

For this workshop, we solicit papers that address any aspect of QA-focused IR, in order to improve overall system performance. Possible topics include, but are not limited to:

  o parameterizations/optimizations of specific IR systems for QA
  o studies of query formation strategies suited to QA, e.g. named entity pre-processing of questions
  o different uses of IR for different question types (e.g. factoid, list, definition, event, how, ...)
  o utility of term matching constraints, e.g. term proximity, for QA
  o analyses of differing IR techniques for QA
  o impact of IR performance on overall QA performance
  o QA-orientated corpus pre-processing, e.g. indexing POS tags, named entities, semantically-tagged entities, relationships, etc. rather than simply tokens
  o evaluation measures for assessing IR for QA
  o retrieval from semi-structured data - i.e. QA from Wikipedia


Important Dates
===============

Paper Submission Deadline: 28th April
Notification of Acceptance: 6th June
Camera-Ready Papers Due: 1st July

Submission Instructions
=======================

Authors are invited to submit original, unpublished work on the topic areas of the workshop. Submissions should follow the standard two-column formatting instructions for the main COLING 2008 conference. Submitted papers should be no longer than eight (8) pages, including references. We strongly recommend the use of the LaTeX and Microsoft Word style files, which are available from the main conference website.

As reviewing will be blind, the paper should not include the authors' names and affiliations. Furthermore, self-references that reveal the authors' identities, e.g., "We previously showed (Smith, 1991) ...", should be avoided. Instead, use citations such as "Smith previously showed (Smith, 1991) ...".

Submission will be electronic. Details will appear on the workshop web site (http://nlp.shef.ac.uk/ir4qa/2008/).

Questions regarding the submission procedure should be directed to Mark A. Greenwood (mark at dcs.shef.ac.uk).

Workshop Organizers
===================

Mark A. Greenwood
Department of Computer Science, University of Sheffield

Programme Committee Members
===========================

Matthew W. Bilotti (Carnegie Mellon University)
Gosse Bouma (University of Groningen)
Charles Clarke (University of Waterloo)
Hoa Dang (NIST)
Robert Gaizauskas (University of Sheffield)
Eduard Hovy (ISI)
Jimmy Lin (University of Maryland)
John Prager (IBM)
Horacio Saggion (University of Sheffield)
Jörg Tiedemann (University of Groningen)
Bonnie Webber (University of Edinburgh)
Ralph Weischedel (BBN)
