[Corpora-List] Arabic Natural Dialect Processing Workshop New deadlines March 20th

Mourad Abbas abb.mourad at gmail.com
Thu Mar 6 12:14:54 CET 2014

Dear all,

The deadlines of the Arabic Natural Dialect Processing Workshop have been extended to March 20th, 2014.

Best regards

Mourad Abbas https://sites.google.com/site/mouradabbas9/home

Important Dates

Submission Deadline: March 20, 2014 (previously March 10, 2014)
Notification of Acceptance: March 25, 2014
Camera Ready Submission: March 31, 2014
Last Day for Registration: March 31, 2014; however, it is recommended to register a few days before
Conference Dates: April 9-11, 2014

Arabic Natural Dialect Processing


to be held in conjunction with The International Conference on Computing Technology and Information Management (ICCTIM2014) Dubai, UAE - April 9-11, 2014


*Modern Standard Arabic (MSA)* is the language of more than 250 million people. It is used mainly in writing and in formal speech. Unfortunately, most Arabs do not use MSA in their daily conversations; the result is that different Arabic dialects are spoken across more than twenty countries. In fact, MSA is not acquired as a mother tongue; rather, it is learned as a second language at school and through exposure to formal broadcast programs (such as the daily news), religious practice, and newspapers. Spoken Arabic is often referred to as colloquial Arabic, dialects, or vernaculars. It is a mixed form with many variations, often strongly influenced by local languages (from before the introduction of Arabic) and by the languages of the countries that occupied the Arabic region. Differences between the variants of spoken Arabic can be large enough to make them incomprehensible to Arabic speakers from different regions.

Hence, given the large differences between these spoken varieties, we can consider them as disparate languages or, more exactly, as different dialects depending on the geographical area in which they are spoken: Morocco, Algeria, Egypt, ... Because they are generally not written, corpora are not available. Such corpora are essential for mining texts and for developing applications, such as speech recognition or machine translation, that are based on statistical models. The only existing corpora, which have not yet been explored, are those found on social networks; they cannot be used easily because of the multiplicity of formats, the number of foreign words, the mixing of dialects with French or English, and so on. The objective of this workshop is:

This workshop is an opportunity for the NLP community to focus on this challenging topic. It aims to encourage researchers to develop new Arabic dialect resources and tools for processing Arabic dialect corpora, and to help highlight the difficulties of processing Arabic dialects, especially those that use many foreign words adapted lexically and grammatically to Arabic. The workshop topics include, but are not limited to:

1. Collecting Arabic dialect corpora

2. Diacritization of Arabic dialects

3. Mining Arabic social networks

4. Language modelling

5. Arabic dialect morphology

6. Development of mobile Arabic dialect applications: speech recognition, machine translation

7. Tagging Arabic dialect corpora

8. Maghreb Arabic dialects versus Orient Arabic dialects: Linguistic


Format and duration: a full-day workshop will be held on April 10, 2014. The language of the workshop is English, and submissions should follow the ICCTIM2014 paper submission instructions. All papers will be peer reviewed. Papers must be submitted electronically in PDF format as soon as possible and before March 20, 2014.

When you submit using the OpenConf management system, please select "others" among the proposed topics and enter "Arabic dialect" in the keywords.

In all cases, when you submit, please also send an email to the chairman of the workshop: smaili at loria.fr
