[Corpora-List] CFP: Building and Using Comparable Corpora at ACL'17, Vancouver, Canada

Serge Sharoff S.Sharoff at leeds.ac.uk
Sat Apr 22 22:14:03 CEST 2017


Dear all,

the deadline for submitting to the BUCC workshop has been extended to the end of Thursday, 27 April. The links are:
https://comparable.limsi.fr/bucc2017/
https://www.softconf.com/acl2017/bucc/

Those who have already submitted are welcome to resubmit new versions using the Softconf link above.

Best wishes, Serge

On Sun, 2017-04-16 at 18:20 +0000, Serge Sharoff wrote:
> Dear all,
>
> this is a reminder that Friday, 21 April, is the upcoming deadline for submitting
> papers to the Building and Using Comparable Corpora Workshop; see full
> information below.
>
> For participants in the shared task, the test data will also be released on
> Friday, 21 April, with the deadline for submitting the results on 28 April.
>
> Best wishes,
> Serge
>
> https://comparable.limsi.fr/bucc2017/bucc2017-task.html
>
> 10th Workshop on Building and Using Comparable Corpora 
> Shared task: detection of parallel sentences in Comparable Corpora
>
> Important dates
> Workshop submission deadline: 21 April, 2017
> Workshop notification: 19 May, 2017
> Workshop camera ready: 26 May, 2017
>
> Website: https://comparable.limsi.fr/bucc2017/
>
> *Shared task:  Identifying parallel sentences in comparable corpora*
>
> We announce a new shared task for 2017. As is well known, a bottleneck
> in statistical machine translation is the scarcity of parallel
> resources for many language pairs and domains. Previous research has
> shown that this bottleneck can be reduced by utilizing parallel
> portions found within comparable corpora. These are useful for many
> purposes, including automatic terminology extraction and the training
> of statistical MT systems.
>
> The aim of the shared task is to quantitatively evaluate competing
> methods for extracting parallel sentences from comparable monolingual
> corpora, so as to give an overview of the state of the art and to
> identify the best-performing approaches.
>
> Shared task sample set release: 6 February, 2017
> Shared task training set release: 13 February, 2017
> Shared task test set release: 21 April, 2017
> Shared task test submission deadline: 28 April, 2017
> Shared task camera ready papers: 26 May, 2017
>
> Any submission to the shared task is expected to be accompanied 
> by a short paper (4 pages plus references).  This will be accepted 
> for publication in the workshop proceedings automatically, although 
> the submission will go via Softconf with the standard peer-review 
> process.
>
> Motivation
>
> In the language engineering and the linguistics communities, research
> in comparable corpora has been motivated by two main reasons. In
> language engineering, it is chiefly motivated by the need to use
> comparable corpora as training data for statistical NLP applications
> such as statistical machine translation or cross-lingual retrieval. In
> linguistics, on the other hand, comparable corpora are of interest in
> themselves because they make possible intra-linguistic discoveries and
> comparisons. It is generally accepted in both communities that
> comparable corpora are documents in one or several languages that are
> comparable in content and form in various degrees and dimensions. We
> believe that the linguistic definitions and observations related to
> comparable corpora can improve methods to mine such corpora for
> applications of statistical NLP. As such, it is of great interest to
> bring together builders and users of such corpora.
>
> Topics
>
> We solicit contributions including but not limited to the following
> topics.
>
> Building Comparable Corpora:
> • Human translations
> • Automatic and semi-automatic methods
> • Methods to mine parallel and non-parallel corpora from the Web
> • Tools and criteria to evaluate the comparability of corpora
> • Parallel vs non-parallel corpora, monolingual corpora
> • Rare and minority languages, across language families
> • Multi-media/multi-modal comparable corpora
>
> Applications of comparable corpora:
> • Human translations
> • Language learning
> • Cross-language information retrieval & document categorization
> • Bilingual projections
> • Machine translation
> • Writing assistance
> • Machine learning techniques using comparable corpora
>
> Mining from Comparable Corpora:
> • Induction of morphological, grammatical, and translation rules 
>   from comparable corpora
> • Extraction of parallel segments or paraphrases from comparable 
>   corpora
> • Extraction of bilingual and multilingual translations of single 
>   words and multi-word expressions, proper names, and named 
>   entities from comparable corpora
> • Induction of multilingual word classes from comparable corpora
> • Cross-language distributional semantics
>
> Submission Information
>
>   See BUCC 2017 website: http://comparable.limsi.fr/bucc2017/
>
> Workshop organisers:
>
> Serge Sharoff (University of Leeds, UK), Chair
> Pierre Zweigenbaum (LIMSI-CNRS, Orsay, France), Shared task organiser
> Reinhard Rapp (University of Mainz, Germany)


