[Corpora-List] Workshop: "Citizen Linguistics in Language Resource Development" (CLLRD 2020) at LREC 2020 in Marseille, France

Fiumara, James J jfiumara at ldc.upenn.edu
Tue Jan 21 17:21:54 CET 2020

Call For Papers:

The Linguistic Data Consortium (LDC) will host the workshop "Citizen Linguistics in Language Resource Development" (CLLRD 2020) at LREC 2020 in Marseille, France, on May 16, 2020. Full details and CFP: https://sites.google.com/view/cllrd-2020/


Notwithstanding advances in data collection and processing, language-related research, education and technology development continue to suffer from an inadequate supply of Language Resources (LRs). To supplement traditional LR development, which typically relies upon top-down support from a government or private foundation, Citizen Linguistics (the Citizen Science of Language) changes the incentive model to attract a new workforce, which in turn requires a different kind of workflow. Incentives for Citizen Linguists may include opportunities to learn and develop new skills; to socialize, compete and earn status or recognition; to document their language and promote their culture; and, most importantly, to contribute directly to research and indirectly to a greater cause or social good. By offering human contributors sustained access to appropriate opportunities, activities, and incentives, we can enhance LR development well beyond what traditional direct funding alone can produce. However, along with these new incentives and workflows come new challenges whose solutions are relevant even to expert (paid) annotation.

The goal of this hybrid workshop/tutorial is two-fold. The first is to provide a forum for researchers and practitioners to explore and discuss the issues, advantages and challenges of using Citizen Linguistics as a method for the creation of language resources. The second is to introduce LanguageARC<https://languagearc.com>, a new Citizen Linguistics web portal for collecting language data and judgements.


There will be two sessions at the workshop. For the first session, papers are welcome on any topic related to Citizen Linguistics in the development of Language Resources including:

- case studies

- language specific challenges

- incentive models

- workforce recruitment, training and evaluation

- task design, granularity and assignment

- workflow and ordering

- response evaluation and aggregation

- the preparation of language resources from raw results and their use in research and in developing and evaluating HLTs.

For the second session, papers are welcome on any topic related specifically to the use of LanguageARC.org to create tasks that collect language data for research and development. Presenting authors of Best Papers Employing LanguageARC will receive travel subsidies to present during this workshop at LREC. The second session will also include a brief tutorial on LanguageARC for new or potential users. By the end of the tutorial, attendees will be fully capable of implementing their data collection or annotation project via LanguageARC.


We will accept papers between 4 and 8 pages excluding references. Accepted workshop papers will be published as workshop proceedings along with the main conference papers. Papers must follow the LREC 2020 style sheet<https://lrec2020.lrec-conf.org/en/submission2020/authors-kit/> and author’s kit<https://lrec2020.lrec-conf.org/en/submission2020/authors-kit/> templates. Papers are to be submitted via the workshop START page<https://www.softconf.com/lrec2020/CLLRD2020/>.

Important Dates

- submission deadline: February 17, 2020

- notification of acceptance: March 12, 2020

- deadline for camera-ready versions: April 2, 2020

Identify, Describe and Share your LRs!

* Describing your LRs in the LRE Map is now a normal practice in the submission procedure of LREC (introduced in 2010 and adopted by other conferences). To continue the efforts initiated at LREC 2014 about “Sharing LRs” (data, tools, web-services, etc.), authors will have the possibility, when submitting a paper, to upload LRs in a special LREC repository. This effort of sharing LRs, linked to the LRE Map for their description, may become a new “regular” feature for conferences in our field, thus contributing to creating a common repository where everyone can deposit and share data.

* As scientific work requires accurate citations of referenced work so as to allow the community to understand the whole context and also replicate the experiments conducted by other researchers, LREC 2020 endorses the need to uniquely identify LRs through the use of the International Standard Language Resource Number (ISLRN, www.islrn.org<http://www.islrn.org/>), a Persistent Unique Identifier to be assigned to each Language Resource. The assignment of ISLRNs to LRs cited in LREC papers will be offered at submission time.

For more information please visit the workshop website (https://sites.google.com/view/cllrd-2020/) or contact James Fiumara: jfiumara AT ldc.upenn.edu

The organizing committee,

Chris Callison-Burch, University of Pennsylvania
Christopher Cieri, Linguistic Data Consortium, University of Pennsylvania
James Fiumara, Linguistic Data Consortium, University of Pennsylvania
Mark Liberman, Linguistic Data Consortium, University of Pennsylvania
