The use of language expressions and their meanings follows a Zipfian distribution, featuring a small number of very frequent observations and a very long tail of low-frequency observations. As a result, the data we use for training and testing systems is likewise dominated by the head of the distribution. It is thus not surprising that statistical approaches automatically exploit the distributional dominance of the most ‘popular’ interpretations for disambiguation and reference tasks in NLP: most of the test interpretations in a task correspond to the majority interpretations of the training instances. But what about the long-tail cases? How well can systems semantically interpret low-frequency cases? Humans, notably, do not experience significant problems determining that a long-tail interpretation applies in a specific text. But how can semantic NLP systems deal intelligently with sparse cases? Little attention has been devoted to how systems should solve interpretation tasks for local and perhaps unknown event and entity instances, which are described in sparse documents but might not be observed in any training set or knowledge base. Potentially, this would require new representations and deeper processing than those that work well on the head, involving reading between the lines, e.g. textual entailment and robust (common sense) reasoning.
The goal of this workshop is to bring researchers together to share their experiences and results in dealing with long-tail semantics, i.e. the interpretation of low-frequency or rare events, entities and relations for which there is little training data. We welcome papers that describe: (error) analyses of semantic tasks with respect to head and long-tail distributions, evaluation of different methods on head and long-tail cases, new methods to interpret long-tail phenomena, and the role of context in priming long-tail cases over head interpretations. We are interested in how knowledge and data can be acquired to counterbalance popular data and interpretations, and in how to make semantic tasks sustainable over time as the world and the interpretation space change.
*Topics of interest*
Topics include, but are not limited to, the following:
- System Performance
1. How can we define the head and the tail for each semantic NLP task?
2. Which evaluation metrics are needed to gain insight into system
performance on the tail?
3. Do existing datasets suffice to gain insight into tail performance?
What kind of benchmarks are needed to better track progress in processing
- Data and Knowledge Requirements
1. What kind and amount of (annotated) data is needed?
2. Do customary knowledge sources (e.g. DBpedia, BabelNet, and WordNet)
3. Do we need massive local knowledge resources to represent the world
and all its contexts?
- Methods and linguistic representations
1. Are the methods and representations needed for the tail different
from the ones for the head?
2. How can we transfer models developed for the head to make them
appropriate for modeling the tail?
3. How can the recent advances in deep neural networks and matrix
factorization be directed to accomplish this?
- Contextual adaptation
1. How can we build systems that can switch between contexts of time,
topic, and location (e.g. how to build systems that can adapt to new or
past long-tail realities)?
*Guide for authors*
The deadline to submit papers is September 5, 2017. Paper submissions for IJCNLP will be handled by the Softconf START system (submission link will be provided once available). The program chairs will release both LaTeX and Microsoft Word templates soon.
Prospective authors should submit an extended abstract of 2 pages in length, excluding references, and will be asked to extend it to a long or short paper upon acceptance. For more information on short papers (5 pages of content + 2 pages for references) and long papers (9 pages of content + 2 pages for references), please refer to: http://ijcnlp2017.org/site/page.aspx?pid=148&sid=1133&lang=en
*Important dates*
- Deadline for submission: September 5, 2017
- Notification of acceptance: September 30, 2017
- Deadline for final paper submission: October 10, 2017
Piek Vossen (Vrije Universiteit Amsterdam)
Marten Postma (Vrije Universiteit Amsterdam)
Filip Ilievski (Vrije Universiteit Amsterdam)
Martha Palmer (University of Colorado Boulder)
Chris Welty (Google)
Ivan Titov (University of Edinburgh)
Eduard Hovy (Carnegie Mellon University)
Eneko Agirre (University of the Basque Country)
Philipp Cimiano (University of Bielefeld)
Frank van Harmelen (Vrije Universiteit Amsterdam)
Key-Sun Choi (Korea Advanced Institute of Science and Technology)
Agata Cybulska (Oracle)
Anders Søgaard (University of Copenhagen)
Andre Freitas (University of Passau)
Anselmo Peñas (UNED Madrid)
Antske Fokkens (VU Amsterdam)
Barbara Plank (University of Groningen)
Brian Davis (National University of Ireland Galway)
Dirk Hovy (University of Copenhagen)
Giuseppe Rizzo (ISMB, Turin)
Jacopo Urbani (VU Amsterdam/Max Planck Institute for Informatics)
Johan Bos (University of Groningen)
Lea Frermann (University of Edinburgh)
Leon Derczynski (University of Sheffield)
Karthik Narasimhan (Massachusetts Institute of Technology)
Marco Rospocher (Fondazione Bruno Kessler, Trento)
Marieke van Erp (VU Amsterdam)
Pradeep Dasigi (Carnegie Mellon University)
Ridho Reinanda (University of Amsterdam)
Sabine Schulte im Walde (University of Stuttgart)
Sara Tonelli (Fondazione Bruno Kessler, Trento)
Sebastian Pado (Stuttgart University)
Stephan Oepen (University of Oslo)
Sujay Kumar Jauhar (Carnegie Mellon University)
Tim Baldwin (University of Melbourne)
Tommaso Caselli (VU Amsterdam)
For all enquiries, please contact: m.c.postma at vu.nl or f.ilievski at vu.nl
We look forward to seeing you at SLT-1.
Sincerely,
The Organizing Committee of SLT-1