[Corpora-List] 3rd CFP: 13th Workshop on Innovative Use of NLP for Building Educational Applications (BEA13), featuring two shared tasks!

Joel Tetreault tetreaul at gmail.com
Fri Feb 16 15:39:12 CET 2018

(apologies for cross-posting)



The 13th Workshop on the Innovative Use of NLP for Building Educational Applications (BEA13)

New Orleans, USA; June 05, 2018

(co-located with NAACL)


*Submission Deadline: March 20, 2018*


The BEA Workshop is a leading venue for NLP innovation for educational applications. It is one of the largest one-day workshops in the ACL community. The workshop’s continuous growth illustrates an alignment between societal need and technological advances. NLP capabilities now support an array of learning domains, including writing, speaking, reading, science, and mathematics, as well as the related intrapersonal (e.g., self-confidence) and interpersonal (e.g., peer collaboration) domains that support achievement in the learning domains. Within these domains, the community continues to develop and deploy innovative NLP approaches for use in educational settings.

In the writing and speech domains, automated writing evaluation (AWE) and speech scoring applications, respectively, are commercially deployed in high-stakes assessment and in instructional contexts (including Massive Open Online Courses (MOOCs) and K-12 settings). Commercially deployed plagiarism detection in K-12 and higher education settings is also prevalent. The current educational and assessment landscape in K-12, higher education, and adult learning (in academic and workplace settings) fosters a strong interest in technologies that yield user log data that can be leveraged for analytics that support proficiency measures for complex constructs across learning domains.

For writing, there is a focus on innovation that supports writing tasks requiring source use, argumentative discourse, and factual content accuracy. For speech, there is an interest in advancing automated scoring to include the evaluation of discourse and content features in responses to spoken assessments. General advances in speech technology have promoted a renewed interest in spoken dialog and multimodal systems for instruction and assessment, for instance, for workplace interviews and simulated teaching environments. The explosive growth of mobile applications for game-based and simulation-based instruction and assessment is another area where NLP has begun to play a large role, especially in language learning.

NLP for educational applications has gained visibility outside of the NLP community. First, the Hewlett Foundation reached out to public and private sectors and sponsored two competitions: one for automated essay scoring, and the other for scoring of short response items. The motivation driving these competitions was to engage the larger scientific community in this enterprise. Learning @ Scale <http://learningatscale.acm.org/las2017/> is a relatively new venue for NLP research in education. MOOCs now incorporate AWE systems to manage the several thousand assignments that may be received during a single course. MOOCs for Refugees have more recently emerged in response to current social situations. Courses include language learning, and we can imagine that AWE and other NLP capabilities could support coursework. Another breakthrough for educational applications within the CL community is the presence of a number of shared-task competitions over the last four years, including three shared tasks on grammatical error detection and correction alone. NLP/Education shared tasks, typically in the area of grammatical error detection, have expanded into new areas of research, such as the “Automated Evaluation of Scientific Writing <http://textmining.lt/aesw/index.html>” at BEA11 and Native Language Identification <https://sites.google.com/site/nlisharedtask/home> at BEA12. All of these competitions increased the visibility of, and interest in, our field. In conjunction with the International Joint Conference on Natural Language Processing (ACL-IJCNLP) 2015, the Natural Language Processing Techniques for Educational Applications (NLP-TEA) workshop had a shared task in Chinese error diagnosis, and NLP-TEA held additional shared tasks at its 2016 workshop and at a fourth workshop <https://sites.google.com/view/nlptea2017/> in 2017, co-located with IJCNLP <http://ijcnlp2017.org/site/page.aspx?pid=901&sid=1133&lang=en>.

The 13th BEA workshop will have oral presentation sessions and a large poster session in order to maximize the amount of original work presented. We expect that the workshop will continue to expose the NLP community to technologies that identify novel opportunities for the use of NLP in education, in English and in languages other than English. The workshop will solicit both full papers and short papers for either oral or poster presentation. We will solicit papers that incorporate NLP methods, including, but not limited to: automated scoring of open-ended textual and spoken responses; game-based instruction and assessment; educational data mining; intelligent tutoring; peer review; grammatical error detection; learner cognition; spoken dialog; multimodal applications; tools for teachers and test developers; and use of corpora. Research that incorporates NLP methods for use with mobile and game-based platforms will be of special interest. Specific topics include:

* Automated scoring/evaluation for written student responses (across multiple genres)
    o Content analysis for scoring/assessment
    o Detection and correction of grammatical and other types of errors (such as spelling and word usage)
    o Argumentation, discourse, sentiment, stylistic analysis, & non-literal language
    o Plagiarism detection
    o Detection of features related to interest, motivation, and values in writing tasks

* Intelligent Tutoring (IT), Collaborative Learning Environments
    o Educational Data Mining: Collection of user log data from educational applications
    o Game-based learning
    o Multimodal communication (including dialog systems) between students and computers
    o Knowledge representation in learning systems
    o Concept visualization in learning systems

* Learner cognition
    o Assessment of learners' language and cognitive skill levels
    o Systems that detect and adapt to learners' cognitive or emotional states
    o Tools for learners with special needs

* Use of corpora in educational tools
    o Data mining of learner and other corpora for tool building
    o Annotation standards and schemas / annotator agreement

* Tools and applications for classroom teachers and/or test developers
    o NLP tools for second and foreign language learners
    o Semantic-based access to instructional materials to identify appropriate texts
    o Tools that automatically generate test questions
    o Processing of and access to lecture materials across topics and genres
    o Adaptation of instructional text to individual learners’ grade levels
    o Tools for text-based curriculum development


* Submission Deadline: Tuesday, March 20 - 23:59 EST (New York City Time)
* Notification of Acceptance: Wednesday, April 04
* Camera-ready Papers Due: Monday, April 16
* Workshop: June 05


We will be using the NAACL submission guidelines and style files <http://naacl2018.org/call_for_paper.html> for the BEA13 Workshop this year. Authors are invited to submit a full paper of up to 8 pages of content, with unlimited pages for references. We also invite short papers of up to 4 pages of content, with unlimited pages for references. Final camera-ready versions of accepted papers will be given an additional page of content to address reviewer comments.

Previously published papers cannot be accepted. The submissions will be reviewed by the program committee. As reviewing will be blind, please ensure that papers are anonymous. Self-references that reveal the author's identity, e.g., "We previously showed (Smith, 1991) ...", should be avoided. Instead, use citations such as "Smith previously showed (Smith, 1991) ...".

We have also included a conflict-of-interest section in the submission form. You should mark all potential reviewers who have been authors on the paper, are from the same research group or institution, or have seen versions of this paper or discussed it with you.

We will be using the START conference system to manage submissions. The link will be made live when available.


* Joel Tetreault, Grammarly (primary contact)
* Jill Burstein, Educational Testing Service
* Ekaterina Kochmar, University of Cambridge
* Claudia Leacock, Grammarly
* Helen Yannakoudakis, University of Cambridge


Task Description

The workshop will host a Shared Task on Second Language Acquisition Modeling (SLAM), using data provided by Duolingo, a popular free online computer-aided language learning (CALL) platform. Teams will be provided with “traces” of all translation and transcription exercises from 800+ language learners — annotated for errors — spanning their first 100 days of activity on Duolingo. The task is then to predict errors made by a held-out set of 100+ language learners over their first 100 days. There will be four tracks for learners of English, Spanish, French and German. We believe that this task presents several new and interesting dimensions for research in Second Language Acquisition modeling: (1) subjects are mostly beginners in their respective L2s, (2) success will likely require teams to model learning — and forgetting — over time, and (3) teams are encouraged to use features which generalize across a variety of languages (hence 4 tracks).

*URL*: http://sharedtask.duolingo.com

Task Organizers

Burr Settles (Duolingo), Erin Gustafson (Duolingo), Masato Hagiwara (Duolingo), Bozena Pajak (Duolingo), Joseph Rollinson (Duolingo), Chris Brust (Duolingo), Hideki Shima (Duolingo), Nitin Madnani (Educational Testing Service)


Task Description

Over the past decade, a number of studies have been published on automatic text simplification (Specia, 2010; Saggion et al., 2015; Štajner, 2015). Text simplification systems aim to facilitate reading comprehension for different target readerships, such as foreign language learners and native speakers with low literacy levels or various kinds of reading impairments. The two main factors affecting reading comprehension that these systems address are lexical complexity and syntactic complexity.

Many lexical simplification systems have been proposed to date (Glavaš and Štajner, 2015; Paetzold and Specia, 2016). It has been shown that systems with a separate complex word identification (CWI) module at the beginning of their pipeline outperform systems that treat all words as potentially complex (Paetzold and Specia, 2015). Therefore, automatic identification of words that are difficult for a given target population is an important step in building better-performing lexical simplification systems. This step is known as complex word identification (Shardlow, 2013).

The first shared task on CWI was organized at SemEval 2016 (Paetzold and Specia, 2016b). It featured 21 teams, which submitted 42 systems trained to predict whether words in a given context were complex or non-complex for a non-native English speaker. Following the success of the first CWI shared task, we propose a second edition of the challenge at the BEA 2018 workshop.

The first edition of the CWI challenge included only English data aimed at non-native English speakers, whereas the second edition will feature a multilingual dataset (Yimam, 2017a, 2017b) and four individual tracks: (1) English monolingual CWI, (2) Spanish monolingual CWI, (3) German monolingual CWI, (4) Multilingual CWI with a French test set.

*URL*: https://sites.google.com/view/cwisharedtask2018/

Task Organizers

Chris Biemann (University of Hamburg), Shervin Malmasi (Harvard Medical School), Gustavo Paetzold (University of Sheffield), Lucia Specia (University of Sheffield), Sanja Štajner (University of Mannheim), Anais Tack (KU Leuven), Seid Muhie Yimam (University of Hamburg), Marcos Zampieri (University of Wolverhampton)
