Title: Multitask Learning of Easy-first Hierarchical Tree LSTMs for Joint Syntactic and Semantic Arabic Dependency Parsing
Context: Collaboration between RCLN (https://lipn.univ-paris13.fr/accueil/equipe/rcln/), LIPN, Université Paris 13, and CAMeL Lab (https://bit.ly/2M0XsAG), New York University Abu Dhabi
Host lab: LIPN, Université Paris 13, 99 Avenue Jean Baptiste Clément, 93430 Villetaneuse
Supervisors: Joseph Le Roux and Nadi Tomeh
Collaborators: Nizar Habash and Dima Taji
Start date: February 2020
Duration: 6 months
Salary: 550 euros/month
Profile and required skills:
Master's degree in Computer Science, Computational Linguistics, Applied
Mathematics, or Statistics
Knowledge in Natural Language Processing and Deep Learning is highly
Programming skills in Python (and libraries such as pytorch, numpy, or
How to apply: send your CV and available Master's grades to tomeh at lipn.fr and leroux at lipn.fr
In recent work on semantic parsing, Peng et al. [2017; 2018] and Kurita and Søgaard showed that the overlap between three different theories of semantics and their corresponding representations can be exploited to improve performance on all three tasks. This is done using multitask learning in a deep neural architecture. We would like to explore ways in which this approach can be applied to Arabic, which has rich morphology and complex morpho-syntactic interactions. We will work with two different dependency representations. The first is the Columbia Arabic Treebank (CATiB) representation [Habash and Roth, 2009], which is inspired by traditional Arabic grammar and which focuses on modeling syntactic and morpho-syntactic agreement and case assignment. The second is the Universal Dependencies (UD) representation for Arabic [Taji et al., 2017], which focuses relatively more on semantic/thematic relations within the sentence, and whose design is coordinated with that of a number of other languages [Nivre et al., 2016]. The two representations complement each other and stand to benefit from multitask learning approaches.
In this context, we propose to
(i) Extend the easy-first hierarchical LSTM parser of Kiperwasser and Goldberg [2016] to multitask settings. We have shown that the easy-first approach can be useful for joint lexical segmentation and dependency parsing [Constant et al., 2016]. In that work, our single-task model was the easy-first parser of Goldberg and Elhadad [2010], trained with dynamic oracles [Goldberg and Nivre, 2013];
(ii) Apply the model to parse Arabic sentences to both CATiB and UD representations;
(iii) Employ multitask modeling insights from Peng et al. [2017; 2018] and Kurita and Søgaard to enhance the multitask easy-first parser.
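To make the easy-first idea in (i)-(iii) concrete, here is a minimal sketch of the greedy easy-first loop, run once per representation. It is an illustration under strong assumptions, not the model proposed here: the function names (easy_first_parse, parse_multitask), the Scorer signature, and the toy score tables are all hypothetical. In particular, the scorer below sees only a (head, dependent) pair of token indices, whereas the parsers cited above score attachments using features or tree-LSTM encodings of the partial trees, and a real multitask model would share parameters between tasks rather than use independent score tables.

```python
from typing import Callable, Dict, List

# Hypothetical scorer signature: (head token index, dependent token index) -> score.
Scorer = Callable[[int, int], float]


def easy_first_parse(n_tokens: int, score: Scorer) -> List[int]:
    """Greedy easy-first parsing; returns heads[i] per token, -1 for the root."""
    pending = list(range(n_tokens))  # roots of the partial trees, in sentence order
    heads = [-1] * n_tokens
    while len(pending) > 1:
        best = None  # (score, position of dependent in pending, head, dependent)
        for pos in range(len(pending) - 1):
            left, right = pending[pos], pending[pos + 1]
            # Two candidate attachments per adjacent pair: left heads right, or right heads left.
            for head, dep, dep_pos in ((left, right, pos + 1), (right, left, pos)):
                s = score(head, dep)
                if best is None or s > best[0]:
                    best = (s, dep_pos, head, dep)
        _, dep_pos, head, dep = best
        heads[dep] = head     # commit the highest-scoring ("easiest") attachment
        del pending[dep_pos]  # the dependent is no longer available for attachment
    return heads


def parse_multitask(n_tokens: int, scorers: Dict[str, Scorer]) -> Dict[str, List[int]]:
    """Parse the same sentence once per task, each with its own scorer."""
    return {task: easy_first_parse(n_tokens, s) for task, s in scorers.items()}


# Arbitrary toy scores for a three-token sentence; the task names are labels only
# and the scores do not reflect real CATiB or UD analyses.
task_a_scores = {(1, 0): 3.0, (2, 1): 2.0}
task_b_scores = {(1, 0): 3.0, (1, 2): 2.0}

trees = parse_multitask(3, {
    "task_a": lambda h, d: task_a_scores.get((h, d), 0.0),
    "task_b": lambda h, d: task_b_scores.get((h, d), 0.0),
})
print(trees)
```

With these toy scores, both tasks first attach token 0 to token 1 (the highest-scoring action), then diverge on the remaining pair, so the two output trees differ only in which of tokens 1 and 2 is the root.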
Peng, Hao, Sam Thomson and Noah A. Smith. “Deep Multitask Learning for
Semantic Dependency Parsing.” ACL (2017).
Peng, Hao, Sam Thomson, Swabha Swayamdipta and Noah A. Smith. “Learning
Joint Semantic Parsers from Disjoint Data.” NAACL-HLT (2018).
Kurita, Shuhei and Anders Søgaard. “Multi-Task Semantic Dependency
Parsing with Policy Gradient for Learning Easy-First Strategies.” ACL
Nizar Habash and Ryan M. Roth. "CATiB: The Columbia Arabic Treebank."
Proceedings of Annual Meeting of the Association for Computational
Dima Taji, Nizar Habash, and Daniel Zeman. “Universal Dependencies for
Arabic.” Proceedings of the Workshop on Arabic Natural Language Processing
(with EACL), 2017.
Yoav Goldberg and Michael Elhadad. 2010. An efficient algorithm for
easy-first non-directional dependency parsing. In Human Language
Technologies: NAACL, pages 742–750, Los Angeles, California.
Eliyahu Kiperwasser and Yoav Goldberg. 2016. Easy-first dependency
parsing with hierarchical tree LSTMs. Transactions of the Association
for Computational Linguistics, 4, 445-461.
Mathieu Constant, Joseph Le Roux, Nadi Tomeh. Deep Lexical Segmentation
and Syntactic Parsing in the Easy-First Dependency Framework. NAACL,
2016, San Diego, United States.