The first Workshop on Syntactic Analysis of Non-Canonical Language (SANCL) will be held in conjunction with the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2012), which will take place on June 3-8, 2012 in Montreal, Canada.
Important Dates *** new submission deadline ***
---------------
Apr 04, 2012   Paper submission deadline
Apr 27, 2012   Notification of acceptance
May 07, 2012   Camera-ready deadline
Jun 08, 2012   SANCL workshop at NAACL-HLT 2012
Workshop Description
--------------------
The SANCL workshop aims to provide a forum for all researchers interested in syntactic analysis and parsing of language that is “non-canonical”. By that term we mean structures whose characteristics deviate from the standard written form of the language. Spoken language is a case in point, as are the language of social media, computer-mediated communication in general, the interlanguage produced by language learners, and historical data. All of these pose challenges for parsing models trained on edited newspaper text as well as for the theoretical analysis of these structures.
Scope and Topics
----------------
We aim to encourage a cross-fertilisation of ideas amongst researchers working on different but related problems, such as
- What is the best strategy for parsing non-canonical language?
- Should we treat parsing of non-canonical language as a problem of robustness or domain adaptation?
- Or would it be better to develop new training data sets addressing the particular properties of the data?
- What are the pros and cons of a one-size-fits-all annotation approach and of applying annotation schemes developed for standard written text to non-canonical data?
- Can insights gained from parsing one type of non-canonical text help in parsing another?
- What are the challenges of handling the often heterogeneous nature of the data (e.g. code-switching)?
- What role does pre-processing play in the parsing of non-canonical data?
- To what extent is it necessary or desirable to perform full parsing for some kinds of non-canonical text?
- From a theoretical perspective, what are the appropriate analyses for non-canonical structures?
- How should new linguistic forms emerging from social media be analysed, e.g. the use of hashtags in Twitter?
- What is the optimal unit of analysis?
- For non-sentential units (frequent in spoken language) and especially for elliptical utterances: what kind of information is necessary for a meaningful analysis? Depending on the application, categories like "NP" or "PP" might not be sufficient.
Contributions to the workshop should address the adequate syntactic representation as well as the unit of analysis for the task at hand. We welcome both theoretical and practical contributions for any grammatical framework, any parsing approach and any language.
Submission Details
------------------
Authors are invited to submit long or short papers on original, unpublished work addressing these (or related) topics. Long papers may consist of up to 8 pages of content plus two extra pages for references; short papers may consist of 4 pages of content including references. Papers should be formatted according to the NAACL 2012 guidelines (for more information please visit http://www.naaclhlt2012.org/conference/conference.php).
As the reviewing will be blind, the paper must not include the authors' names and affiliations. Furthermore, self-references that reveal the author's identity, e.g., "We previously showed (Smith, 1991) ..." must be avoided. Instead, use citations such as "Smith previously showed (Smith, 1991) ..." Papers that do not conform to these requirements will be rejected without review. In addition, please do not post your submissions on the web until after the review process is complete.
Papers that have been or will be submitted to other meetings or publications must indicate this at submission time. Please visit the workshop web page (https://sites.google.com/site/sancl2012) for more details.
Papers will be accepted until Apr 04, 2012 (PDT, GMT-8) in PDF format via the START system (https://www.softconf.com/naaclhlt2012/SANCL2012).
Shared Task
-----------
The SANCL 2012 workshop will host the *first shared task on parsing English web text* organised by Google. A session in the workshop will be devoted to presenting and discussing the results of this shared task. For more details, please visit https://sites.google.com/site/sancl2012/home/shared-task
Workshop Organizers
-------------------
Ozlem Cetinoglu (IMS Stuttgart, Germany)
Jennifer Foster (NCLT, DCU, Ireland)
Ines Rehbein (Potsdam University, Germany)
Shared Task Organizers
----------------------
Slav Petrov (Google Research, USA)
Ryan McDonald (Google Research, USA)
Program Committee
-----------------
Bernd Bohnet (IMS Stuttgart, Germany)
Aoife Cahill (Educational Testing Service, USA)
Marie Candito (University of Paris 7, France)
John Carroll (University of Sussex, UK)
Jinho Choi (University of Colorado at Boulder, USA)
Eric de la Clergerie (INRIA, France)
Markus Dickinson (Indiana University, USA)
Steffi Dipper (University of Bochum, Germany)
Gulsen Eryigit (Istanbul Technical University, Turkey)
Stefan Evert (University of Darmstadt, Germany)
Kim Gerdes (University of Paris 3, France)
Ron Kaplan (Microsoft, USA)
Jonas Kuhn (IMS Stuttgart, Germany)
Sandra Kübler (Indiana University, USA)
Joseph Le Roux (Université Paris-Nord, France)
Anke Lüdeling (Humboldt-University of Berlin, Germany)
David McClosky (Stanford University, USA)
Detmar Meurers (University of Tübingen, Germany)
Joakim Nivre (Uppsala University, Sweden)
Lilja Øvrelid (University of Oslo, Norway)
Brian Roark (Oregon Health & Science University, USA)
Kenji Sagae (University of Southern California, USA)
Djamé Seddah (University of Paris 4, France)
Reut Tsarfaty (Uppsala University, Sweden)
Josef van Genabith (Dublin City University, Ireland)
Heike Zinsmeister (University of Konstanz, Germany)
Contact Information
-------------------
For general questions about the workshop, please email sancl2012contact at gmail.com. For specific questions about the shared task, please email the shared task organizers (parsingtheweb at gmail.com). Additional information about SANCL 2012 can be found at https://sites.google.com/site/sancl2012.