[Corpora-List] Final call for participation: 14th BUCC Workshop, Monday, Sept. 6, 2021 (Building and Using Comparable Corpora)

Reinhard Rapp reinhardrapp at gmx.de
Fri Sep 3 11:54:19 CEST 2021


**************************************************************

14th WORKSHOP ON BUILDING AND USING COMPARABLE CORPORA (online)

In conjunction with RANLP 2021 (online)

Monday, September 6, 2021

Workshop website: https://comparable.limsi.fr/bucc2021/

RANLP website: https://ranlp.org/ranlp2021

INVITED SPEAKERS

* Pushpak Bhattacharyya https://www.cse.iitb.ac.in/~pb/

* Tomas Mikolov https://ricaip.eu/home/responsible-research-and-innovation-strategy/research-teams/tomas-mikolov

* Sujith Ravi https://www.sravi.org/

**************************************************************

The workshop will be held on Monday, Sept. 6, 2021 using Zoom. You can still register for it on the RANLP main conference website at https://ranlp.org/ranlp2021/fees.php . The registration fee for non-presenters is 15 Euros and there is no late registration surcharge.

Please find the workshop programme below. A formatted version of it will be posted on the workshop website (see URL above) by Sept. 4. The proceedings will follow by Sept. 5.

**************************************************************

Programme 14th BUCC Workshop, Monday, Sept. 6, 2021 (subject to change)

All times are in UTC+0

For a time zone converter and time difference calculator, see e.g. https://www.timeanddate.com/worldclock/converter.html

Note that during summer time (which is applicable on Sept. 6) London is at UTC+1

8:00 - 8:05 Welcome

8:05 - 9:00 Invited presentation

     Machine Translation in Low Resource Setting

     Pushpak Bhattacharyya

9:00 - 9:25

     EM Corpus: a comparable corpus for a less-resourced language pair

     Manipuri-English

     Rudali Huidrom, Yves Lepage and Khogendra Khomdram

9:25 - 9:40 Coffee break

9:40 - 10:05

     Mining Bilingual Word Pairs from Comparable Corpus using Apache

     Spark Framework

     Sanjanasri JP, Vijay Krishna Menon, Soman KP and Krzysztof Wolk

10:05 - 10:30

     Employing Wikipedia as a resource for Named Entity Recognition in

     Morphologically complex under-resourced languages

     Aravind Krishnan, Stefan Ziehe, Franziska Pannach and

     Caroline Sporleder

10:30 - 10:55

     Semi-Automated Labeling of Requirement Datasets for Relation

     Extraction

     Jeremias Bohn, Jannik Fischbach, Martin Schmitt, Hinrich Schütze

     and Andreas Vogelsang

10:55 - 11:20

     A Dutch Dataset for Cross-lingual Multilabel Toxicity Detection

     Ben Burtenshaw and Mike Kestemont

11:20 - 12:10 Lunch break

12:10 - 13:05 Invited presentation

     Topic: Language modeling and AI

     Tomas Mikolov

13:05 - 13:30

     Syntax-aware Transformers for Neural Machine Translation:

     The Case of Text to Sign Gloss Translation

     Santiago Egea Gómez, Euan McGill and Horacio Saggion

13:30 - 13:55

     Effective Bitext Extraction from Comparable Corpora Using a

     Combination of Three Different Approaches

     Steinþór Steingrímsson, Pintu Lohar, Hrafn Loftsson and Andy Way

13:55 - 14:10 Coffee break

14:10 - 14:35

     Majority Voting with Bidirectional Pre-translation For Bitext

     Retrieval

     Alexander G. Jones and Derry Tanti Wijaya

14:35 - 15:00

     Extracting IPA in Wiktionary: Experiments on Multilingual

     Syllabification and Stress Prediction

     Winston Wu and David Yarowsky

15:00 - 15:55 Invited presentation

     Title tba

     Sujith Ravi

15:55 - 16:00 Closing


