[Corpora-List] LREC Shared task on reproduction, second CfP

Valia Kordoni evangelia.kordoni at anglistik.hu-berlin.de
Wed Jun 19 20:14:20 CEST 2019


Is there going to be a dataset release per task?

Valia Kordoni

On Wed, June 19, 2019 16:01, Vossen, P.T.J.M. wrote:
> [Apologies for multiple postings]
> Shared Task on the Reproduction of Research Results in Science and
> Technology of Language
> (part of LREC 2020 conference)
> Marseille, France
> May 13-15, 2020
> http://wordpress.let.vupr.nl/lrec-reproduction
> We are very pleased to announce REPROLANG 2020, the Shared Task on the
> Reproduction of Research Results in Science and Technology of Language,
> organized by ELRA - European Language Resources Association with the
> technical support of CLARIN - European Research Infrastructure for
> Language Resources and Technology, as part of the LREC 2020 conference.
> Scientific knowledge is grounded on falsifiable predictions, and thus its
> credibility and raison d’être rely on the possibility of repeating
> experiments and obtaining results similar to those originally reported.
> In many young scientific areas, including ours, the reproduction of
> research results needs much greater acknowledgement and promotion.
> For this reason, a special track on reproducibility is included in the
> LREC 2020 conference regular program (side by side with sessions on
> other topics) for papers on reproduction of research results, and the
> present community-wide shared task is launched to elicit and
> motivate the spread of scientific work on reproduction. This initiative
> builds on the previous pioneering LREC workshops on reproducibility, 4REAL
> 2016 and 4REAL 2018.
> The shared task is of a new type: it is partly similar to the usual
> competitive shared tasks, in the sense that all participants share a
> common goal; but it is partly different from previous shared tasks, in
> the sense that its primary focus is on seeking support and confirmation of
> previous results, rather than on surpassing those results with
> superior ones. Thus, instead of a competitive shared task, with each
> participant striving for an individual top system that scores as far as
> possible above a rough baseline, this will be a cooperative shared task,
> with participants striving for systems that reproduce an original complex
> research experiment as closely as possible, and thus reinforcing the
> reliability of its results through their convergent outcomes. At the same
> time, as with competitive shared tasks, participating in the cooperative
> shared task offers excellent ground for new ideas for improvement and new
> advances beyond the reproduced results.
> We invite researchers to reproduce the results of a selected set of
> articles, which have been offered by their respective authors, with their
> consent, for use in this shared task. Papers submitted for this task
> are expected to report on reproduction findings, to document how the
> results of the original paper were reproduced, to discuss reproducibility
> challenges, to report the time, space or data requirements encountered in
> training and testing, to reflect on lessons learned, to elaborate on
> recommendations for best practices, etc.
> Submissions that, in addition to the reproduction exercise, also report on
> results of the replication of the selected tasks with other languages,
> domains, data sets, models, methods, algorithms, downstream tasks, etc.
> are also encouraged. These should provide insight into the
> robustness of the replicated approaches, their learning curves and
> potential for incremental performance, their capacity for generalization,
> their transferability across experimental circumstances and into possible
> real-life usage scenarios, their suitability to support further progress,
> etc.
> LREC conferences have one of the top h5-index scores of research impact
> among the world class venues for research on Human Language Technology.
> Accepted papers for the shared task will be published in the Proceedings
> of the LREC 2020 main conference. LREC Proceedings are freely available
> from ELRA and ACL Anthology. They are indexed in Scopus (Elsevier) and in
> DBLP. LREC 2010, LREC 2012 and LREC 2014 Proceedings are included in the
> Thomson Reuters Conference Proceedings Citation Index (the other editions
> are being processed).
> Substantially extended versions of papers selected by reviewers as the
> most appropriate will be considered for publication in special issues of
> the Language Resources and Evaluation Journal published by Springer (a
> SCI-indexed journal).
> November 25, 2019: deadline for paper submission (aligned with LREC 2020)
> November 27: deadline for projects in gitlab.com to go public
> February 14, 2020: notification of acceptance
> May 11-16: LREC conference takes place
> The Selection Committee has selected a broad range of papers and tasks.
> Chapter A: Lexical processing
> Task A.1: Cross-lingual word embeddings
> Artetxe, Mikel, Gorka Labaka, and Eneko Agirre. 2018. “A robust
> self-learning method for fully unsupervised cross-lingual mappings of word
> embeddings”. In Proceedings of the 56th Annual Meeting of the Association
> for Computational Linguistics (ACL 2018), pp. 789–798.
> http://aclweb.org/anthology/P18-1073
> Major reproduction comparables: Accuracy scores (tables 1 to 4).
> Task A.2: Named entity embeddings
> Newman-Griffis, Denis, Albert M Lai, and Eric Fosler-Lussier. 2018.
> “Jointly Embedding Entities and Text with Distant Supervision”. In
> Proceedings of The Third Workshop on Representation Learning for NLP, pp.
> 195–206.
> http://aclweb.org/anthology/W18-3026
> Major reproduction comparables: Spearman’s ρ scores for semantic
> similarity predictions (tables 3 and 4), and accuracy scores (table 6).
> Chapter B: Sentence processing
> Task B.1: POS tagging
> Bohnet, Bernd, Ryan McDonald, Gonçalo Simões, Daniel Andor, Emily Pitler,
> and Joshua Maynez. 2018. “Morphosyntactic Tagging with a Meta-BiLSTM Model
> over Context Sensitive Token Encodings”. In Proceedings of the 56th Annual
> Meeting of the Association for Computational Linguistics (ACL 2018), pp.
> 2642–2652.
> http://aclweb.org/anthology/P18-1246
> Major reproduction comparables: f-score values (tables 2 to 8).
> Task B.2: Sentence semantic relatedness
> Gupta, Amulya, and Zhu Zhang. 2018. “To Attend or not to Attend: A Case
> Study on Syntactic Structures for Semantic Relatedness”. In Proceedings of
> the 56th Annual Meeting of the Association for Computational Linguistics
> (ACL 2018), pp. 2116–2125.
> http://aclweb.org/anthology/P18-1197
> Major reproduction comparables: Pearson’s r and Spearman’s ρ scores for
> semantic relatedness (table 1), and f-score values for paraphrase
> detection (table 2).
> Chapter C: Text processing
> Task C.1: Relation extraction and classification
> Rotsztejn, Jonathan, Nora Hollenstein, and Ce Zhang. 2018. “ETH-DS3Lab at
> SemEval-2018 Task 7: Effectively Combining Recurrent and Convolutional
> Neural Networks for Relation Classification and Extraction”. In
> Proceedings of the 12th International Workshop on Semantic Evaluation
> (SemEval 2018), pp. 689–696.
> http://aclweb.org/anthology/S18-1112
> Major reproduction comparables: precision, recall and f-score values
> (tables 3 and 4).
> Task C.2: Privacy preserving representation
> Li, Yitong, Timothy Baldwin, and Trevor Cohn. 2018. “Towards Robust and
> Privacy-preserving Text Representations”. In Proceedings of the 56th
> Annual Meeting of the Association for Computational Linguistics (ACL
> 2018), pp. 25-30.
> http://aclweb.org/anthology/P18-2005
> Major reproduction comparables: POS accuracy scores (tables 1 and 2), and
> sentiment analysis f-score values (table 3).
> Task C.3: Language modelling
> Howard, Jeremy, and Sebastian Ruder. 2018. “Universal Language Model
> Fine-tuning for Text Classification”. In Proceedings of the 56th Annual
> Meeting of the Association for Computational Linguistics (ACL 2018), pp.
> 328–339.
> http://aclweb.org/anthology/P18-1031
> Major reproduction comparables: Error rate (%) scores in sentiment
> analysis and question classification tasks (tables 2 and 3).
> Chapter D: Applications
> Task D.1: Text simplification
> Nisioi, Sergiu, Sanja Stajner, Simone Paolo Ponzetto, and Liviu P. Dinu.
> 2017.
> “Exploring Neural Text Simplification Models”. In Proceedings of the 55th
> Annual Meeting of the Association for Computational Linguistics (ACL
> 2017), pp. 85-91.
> http://aclweb.org/anthology/P/P17/P17-2014.pdf
> Major reproduction comparables: Averaged human evaluation scores, by 3
> evaluators, on 1 to 5 and -2 to +2 scales (table 2).
> Task D.2: Language proficiency scoring
> Vajjala, Sowmya, and Taraka Rama. 2018. “Experiments with Universal CEFR
> classifications”.
> In Proceedings of Thirteenth Workshop on Innovative Use of NLP for
> Building Educational Applications, pp. 147–153.
> http://aclweb.org/anthology/W18-0515
> Major reproduction comparables: f-score values (tables 2, 3 and 4).
> Task D.3: Neural machine translation
> Vanmassenhove, Eva, and Andy Way. 2018. “SuperNMT: Neural Machine
> Translation with Semantic Supersenses and Syntactic Supertags”. In
> Proceedings of the 56th Annual Meeting of the Association for
> Computational Linguistics (ACL 2018), pp. 67–73.
> http://aclweb.org/anthology/P18-3010
> Major reproduction comparables: BLEU scores (tables 1 and 2; plots in
> figures 2, 3 and 4).
> Chapter E: Language resources
> Task E.1: Parallel corpus construction
> Brunato, Dominique, Andrea Cimino, Felice Dell'Orletta, and Giulia
> Venturi. 2016. “PaCCSS-IT: A Parallel Corpus of Complex-Simple Sentences
> for Automatic Text Simplification”. In Proceedings of the 2016 Conference
> on Empirical Methods in Natural Language Processing (EMNLP 2016), pp.
> 351-361.
> https://aclweb.org/anthology/D16-1034
> Major reproduction comparables: data set.
> Participants are expected to obtain the data and tools for the
> reproduction from the information provided in the paper. Using the
> description of the experiment is part of the reproduction exercise.
> The START platform of LREC 2020 will be used for the submission of the
> following required elements: A paper describing the reproduction effort,
> and a link to the software and data used to obtain the results reported in
> the paper (more details below). The submitted materials and results will
> be checked by a CLARIN panel. Papers will be peer-reviewed.
> REPROLANG 2020 invites the submission of full papers of 4 to 8
> pages (plus additional pages for references if needed). These submissions
> must strictly follow the LREC 2020 conference stylesheet, which will be
> available on the conference website.
> For the submission to be complete and checked by a CLARIN panel, the
> software used to obtain the results reported in the paper must be made
> available as a Docker container through a project on gitlab.com. Detailed
> instructions are available at https://gitlab.com/CLARIN-ERIC/reprolang/
> For technical support, the CLARIN team can be contacted at
> reprolang-tc at clarin.eu or an issue can be
> created under https://gitlab.com/CLARIN-ERIC/reprolang/issues.
> Submissions are made via the START conference management system used by
> LREC 2020 and include the following elements:
> - URL of your gitlab.com project
> - URL of the tar.gz with the datasets
> - the md5 checksum of the above tar.gz
> - .pdf with the paper, which must include the above URL of your
> gitlab.com project, and the above commit hash and tag
> The project in gitlab.com should be made public within
> 2 days after the submission deadline.
> Papers accepted for publication will be presented in a specific session of
> the LREC main conference. There is no difference in quality between oral
> and poster presentations. Only the appropriateness of the type of
> communication (more or less interactive) to the content of the paper will
> be considered. The format of the presentations will be decided by the
> Program Committee. The proceedings will include both oral and poster
> papers in the same format.
> For a selected paper to be included in the programme and to be published
> in the proceedings, at least one of its authors must register for the LREC
> 2020 conference by the early bird registration deadline. A single
> registration only covers one paper, following the general LREC policy on
> registration. Registration service is to be found at the LREC 2020
> website.
> About the shared task:
> Piek Vossen
> p.t.j.m.vossen at vu.nl
> About the preparation and submission of materials:
> reprolang-tc at clarin.eu
> REPROLANG 2020 website: http://wordpress.let.vupr.nl/lrec-reproduction
> António Branco, University of Lisbon (chair of Steering Committee)
> Nicoletta Calzolari, ILC, Pisa (co-chair of Steering Committee)
> Gertjan van Noord, University of Groningen (chair of Task Selection
> Committee)
> Piek Vossen, VU University Amsterdam (chair of Program Committee)
> Khalid Choukri, ELRA/ELDA
> Task Selection Committee:
> Gertjan van Noord, University of Groningen (chair)
> Tim Baldwin, University of Melbourne
> António Branco, University of Lisbon
> Nicoletta Calzolari, ILC, Pisa
> Çağrı Çöltekin, University of Tuebingen
> Nancy Ide, Vassar College, New York
> Malvina Nissim, University of Groningen
> Stephan Oepen, University of Oslo
> Barbara Plank, University of Copenhagen
> Piek Vossen, VU University Amsterdam
> Dan Zeman, Prague University
> Technical Committee:
> reprolang-tc at clarin.eu
> Dieter Van Uytvanck, CLARIN (chair)
> André Moreira, CLARIN
> Twan Goosen, CLARIN
> João Ricardo Silva, CLARIN and University of Lisbon
> Luís Gomes, CLARIN and University of Lisbon
> Willem Elbers, CLARIN
> Program Committee:
> Piek Vossen, VU University Amsterdam (chair)
> Gilles Adda, LIMSI-CNRS, Paris
> Eneko Agirre, Basque University
> Francis Bond, Nanyang Technological University, Singapore
> António Branco, University of Lisbon
> Nicoletta Calzolari, ILC, Pisa
> Khalid Choukri, ELRA/ELDA
> Kevin Cohen, University of Colorado Boulder
> Thierry Declerck, DFKI Saarbruecken
> Nancy Ide, Vassar College, New York
> Antske Fokkens, VU University Amsterdam
> Karën Fort, University of Paris-Sorbonne
> Cyril Grouin, LIMSI-CNRS
> Mark Liberman, University of Pennsylvania
> John McCrae, Galway University
> Margo Mieskes, University of Applied Sciences Darmstadt
> Aurélie Névéol, LIMSI-CNRS
> Gertjan van Noord, University of Groningen
> Stephan Oepen, University of Oslo
> Ted Pedersen, University of Minnesota
> Senja Pollak, Jozef Stefan Institute, Ljubljana
> Paul Rayson, Lancaster University
> Martijn Wieling, University of Groningen
