[Corpora-List] 2nd CFP: Towards Enhanced Interoperability for Large HLT Systems: UIMA for NLP

Katrin Tomanek katrin.tomanek at uni-jena.de
Fri Feb 15 14:08:23 CET 2008



Towards Enhanced Interoperability for Large HLT Systems: UIMA for NLP


Full-day workshop held in conjunction with LREC 2008, May 31, 2008, Marrakech, Morocco


Submission deadline: 29 February 2008
*********************************************************

The development and incremental modification of large and complex HLT systems has long been an art rather than a workflow guided by software engineering practices and principles. Interoperability of system components was hard to achieve, and exchanging modules was a painstaking task, owing both to the low level of abstraction of the specifications describing the interfaces between modules and to the data and control flow interdependencies among them.

UIMA, the Unstructured Information Management Architecture, is an open-platform middleware structure for dealing with unstructured information (text, speech, audio, video data), originally launched by IBM. In the meantime, the Apache Software Foundation has established an incubator project for developing UIMA-based software (http://incubator.apache.org/uima/). The Organization for the Advancement of Structured Information Standards (OASIS) has established a Technical Committee to standardize the UIMA specification. Accordingly, an increasing number of NLP research institutes as well as HLT companies all over the world are basing their system development efforts on UIMA specifications to adhere to emerging standards.

As far as NLP proper is concerned, Carnegie Mellon University's Language Technologies Institute is hosting a UIMA Component Repository web site (http://uima.lti.cs.cmu.edu), where developers can post information about their analytics components and anyone can find out more about free and commercially available UIMA-compliant analytics. Additionally, free analytic tools that can work with UIMA include those from the General Architecture for Text Engineering (GATE, http://gate.ac.uk/) and OpenNLP (http://opennlp.sourceforge.net/) communities, as well as from Jena University’s Language & Information Engineering (JULIE) Lab (http://www.julielab.de). Commercial analytics are available from IBM, as well as from other software vendors such as Attensity, ClearForest, Temis and Nstein.

In this workshop we want to bring together representatives from various NLP research sites who have gained experience in working with UIMA specifications in the framework of complex NLP systems. As several large HLT systems have already been integrated with UIMA, we also encourage papers in the form of case studies on those systems. We further aim to combine the results of the participants' work in order to discuss and possibly elaborate on emerging UIMA standards for NLP systems.


Paper Submissions
===================

We seek high-quality papers which report on experience using UIMA for the design and implementation of complex NLP systems. Papers reporting on experience using other middleware frameworks and software engineering practices in this context are also welcome. Papers from both academia and industry are solicited. Long papers should not exceed 8 pages; short papers and demo descriptions should not exceed 4 pages (using the LREC formatting style). For details, please consult the submission section on the workshop website.


Important Dates
==================

February 29, 2008 - Deadline for workshop paper
March 26, 2008 - Notification of acceptance
April 4, 2008 - Camera-ready papers due
May 31, 2008 - Workshop in Marrakech

For any inquiries regarding the workshop please contact Udo Hahn (udo.hahn at uni-jena.de).


Organising Committee
=======================

Udo Hahn (Jena University, Germany)
Thilo Götz (IBM Germany, Germany)
Eric W. Brown (IBM T.J. Watson Research Center, USA)
Hamish Cunningham (University of Sheffield, UK)
Eric Nyberg (Carnegie-Mellon University, USA)


Program Committee
====================

Branimir Boguraev (IBM Watson Research Center, USA)
Eric W. Brown (IBM T.J. Watson Research Center, USA)
Ekaterina Buyko (Jena University, Germany)
Hamish Cunningham (University of Sheffield, UK)
Dave Ferrucci (IBM T.J. Watson Research Center, USA)
Stefan Geissler (TEMIS Deutschland, Germany)
Thilo Götz (IBM Germany, Germany)
Iryna Gurevych (TU Darmstadt, Germany)
Udo Hahn (Jena University, Germany)
Marti Hearst (University of California, Berkeley, USA)
Larry Hunter (University of Colorado, USA)
Nancy Ide (Vassar College, USA)
Eric Nyberg (Carnegie-Mellon University, USA)
Sameer Pradhan (BBN, USA)
Dietmar Roesner (University of Magdeburg, Germany)
John Tait (University of Sunderland, UK)
Graham Wilcock (University of Helsinki, Finland)
