[Corpora-List] CFPs: Fourth Workshop on Computational Approaches to Linguistic Code-Switching

Thamar Solorio thamar.solorio at gmail.com
Tue Jan 21 20:41:05 CET 2020


[Apologies for cross-listing]

Call for Papers

Code-switching (CS) is the phenomenon by which multilingual speakers switch back and forth between their common languages in written or spoken communication. CS typically occurs at the intersentential and intrasentential levels (the latter mixing words from multiple languages in the same utterance) and even at the morphological level (mixing of morphemes). CS presents serious challenges for language technologies such as parsing, machine translation (MT), automatic speech recognition (ASR), information retrieval (IR), information extraction (IE), and semantic processing. Traditional techniques trained for one language quickly break down when input from another language is mixed in. Even for problems that are considered solved for specific domains and languages, such as language identification or part-of-speech tagging, performance degrades at a rate proportional to the amount and level of mixed-language content present.

This workshop aims to bring together researchers interested in solving the problem and to increase community awareness of viable solutions that reduce the complexity of the phenomenon. The workshop invites contributions from researchers working on NLP approaches for the analysis and processing of mixed-language data, especially with a focus on intrasentential code-switching. Topics of relevance to the workshop include the following:

- Development of linguistic resources to support research on code-switched data
- NLP approaches for language identification in code-switched data
- NLP approaches for named entity recognition in code-switched data
- NLP techniques for the syntactic analysis of code-switched data
- NLP techniques for higher-level tasks on code-switched data, such as Q&A, language understanding, grounding
- Domain/dialect/genre adaptation techniques applied to code-switched data processing
- Language modeling approaches to code-switched data processing
- Crowdsourcing approaches for the annotation of code-switched data
- Machine translation approaches for code-switched data
- Multimodal approaches to processing code-switched data
- Application of low-resource processing paradigms to code-switched data processing
- Position papers discussing the challenges of code-switched data to NLP techniques
- Methods for improving ASR in code-switched data
- Survey papers of NLP research for code-switched data
- Sociolinguistic aspects of code-switching
- Sociopragmatic aspects of code-switching

Theme

This year we propose a theme for the workshop around resources and evaluation metrics and frameworks. The goal of the theme is to disseminate more broadly the data sets that are available for the research community, and to engage the community in a discussion about adopting best practices and common frameworks to enable a comprehensive evaluation of technology for code-switched data. We welcome submissions responsive to the theme, in addition to the topics listed above.

Important Dates:

Paper submission: February 20th, 2020

Notification of acceptance: March 16th, 2020

Camera-ready submission deadline: April 5th, 2020

Invited Speakers:

Alan W. Black, Carnegie Mellon University

Organizing Committee:

Thamar Solorio

Associate Professor

Department of Computer Science

University of Houston

thamar.solorio at gmail.com

Research interests: syntactic analysis of code-switched data, information extraction for social media data, analysis of style in text, detection of objectionable content online

Monojit Choudhury

Principal Researcher

Microsoft Research Lab India

monojitc at microsoft.com

Research interests: computational processing of code-switched text, NLP for low resource languages, computational sociolinguistics and pragmatics.

Kalika Bali

Principal Researcher

Microsoft Research Lab India

kalikab at microsoft.com

Research interests: computational processing of code-switched text and speech, NLP for low resource languages, computational sociolinguistics.

Sunayana Sitaram

Senior Researcher

Microsoft Research Lab India

sunayana.sitaram at microsoft.com

Research interests: computational processing of code-switched spoken language, speech processing for low-resource languages, speech and language systems for multilingual communities

Amitava Das

Lead Scientist

Wipro AI Lab India

amitava.das2 at wipro.com

Research interests: Code-Mixing, Social Computing, Conversational System

Mona Diab

Principal Scientist

Amazon AWS AI

Professor of Computer Science, GWU, USA

diabmona at amazon.com

Research interests: Code Switching, Low Resource Scenarios, Conversational AI

Workshop website:

https://cod1r.github.io/Code-Switching/

Contact workshop organizers:

codeswitching_workshop at googlegroups.com

Program Committee:

Gustavo Aguilar, University of Houston

Barbara Bullock, University of Texas at Austin

Özlem Cetinoglu, University of Stuttgart

Hila Gonen, Bar Ilan University

Sandipan Dandapat, Microsoft

A. Seza Doğruöz, Google Research

William H. Hsu, Kansas State University

Constantine Lignos, Brandeis University

Rupesh Mehta, Microsoft

Joel Moniz, Carnegie Mellon University

Adithya Pratapa, Carnegie Mellon University

Yihong Theis, Kansas State University

Jacqueline Toribio, University of Texas at Austin

Genta Indra Winata, Hong Kong University of Science and Technology


