The Linguistic Data Consortium (LDC) will host the workshop "Citizen Linguistics in Language Resource Development" (CLLRD 2020) at LREC 2020 in Marseille, France, on May 16, 2020. Full details and CFP: https://sites.google.com/view/cllrd-2020/


Notwithstanding advances in data collection and processing, language-related research, education and technology development continue to suffer from an inadequate supply of Language Resources (LRs). To supplement traditional LR development, which typically relies upon top-down support from a government or private foundation, Citizen Linguistics (the Citizen Science of Language) changes the incentive model to attract a new workforce, which in turn requires a different kind of workflow. Incentives to Citizen Linguists may include opportunities to learn and develop new skills; to socialize, compete and earn status or recognition; to document their language and promote their culture; and, most importantly, to contribute directly to research and indirectly to a greater cause or social good. By offering human contributors sustained access to appropriate opportunities, activities, and incentives, we can enhance LR development well beyond what traditional direct funding alone can produce. However, along with these new incentives and workflows come new challenges whose solutions are relevant even to expert (paid) annotation.

The goal of this hybrid workshop/tutorial is twofold. The first is to provide a forum for researchers and practitioners to explore and discuss the issues, advantages and challenges of using Citizen Linguistics as a method for the creation of language resources. The second is to introduce LanguageARC (https://languagearc.com), a new Citizen Linguistics web portal for collecting language data and judgements.


There will be two sessions at the workshop. For the first session, papers are welcome on any topic related to Citizen Linguistics in the development of Language Resources including:

- case studies

- language specific challenges

- incentive models

- workforce recruitment, training and evaluation

- task design, granularity and assignment

- workflow and ordering

- response evaluation and aggregation

- the preparation of language resources from raw results and their use in research and in developing and evaluating HLTs.

For the second session, papers are welcome on any topic related specifically to the use of LanguageARC.org to create tasks that collect language data for research and development. Presenting authors of Best Papers Employing LanguageARC will receive travel subsidies to present during this workshop at LREC. The second session will also include a brief tutorial on LanguageARC for new or potential users. By the end of the tutorial, attendees will be fully capable of implementing their data collection or annotation project via LanguageARC.


We will accept papers of 4 to 8 pages, excluding references. Accepted workshop papers will be published as workshop proceedings along with the main conference papers. Papers must follow the LREC 2020 style sheet and author’s kit templates (https://lrec2020.lrec-conf.org/en/submission2020/authors-kit/). Papers are to be submitted via the workshop START page (https://www.softconf.com/lrec2020/CLLRD2020/).

Important Dates

- submission deadline: February 17, 2020

- notification of acceptance: March 12, 2020

- deadline for camera-ready versions: April 2, 2020

For more information please visit the workshop website (https://sites.google.com/view/cllrd-2020/) or contact James Fiumara: jfiumara AT ldc.upenn.edu

The organizing committee,

Chris Callison-Burch, University of Pennsylvania
Christopher Cieri, Linguistic Data Consortium, University of Pennsylvania
James Fiumara, Linguistic Data Consortium, University of Pennsylvania
Mark Liberman, Linguistic Data Consortium, University of Pennsylvania

