Funded by ERC Starting Grant 715154, AMORE: A distributional MOdel of Reference to Entities (www.upf.edu/web/amore)
Application deadline: Wednesday March 18 2020
Humans can communicate in part because they share the way they refer to objects. For instance, suppose I see my neighbor's dog, a chihuahua, running in the park: do I refer to it as "the animal", or "the dog", or "the chihuahua"? Or maybe "the chihuahua that's running towards the tree", or "the small dog on the left"? For any given object, there is a large number of different referring expressions that we could choose to use; and yet there are regularities in how people choose, and interpret, such referring expressions, as otherwise we would not be able to communicate. Despite substantial work on this topic in Computational Linguistics, Linguistics, and Cognitive Science, it is still far from clear how reference works.
This project examines reference to objects in visual data (images, perhaps also video) with two methodologies:
- Data Science for Linguistics / Cognitive Science
- Artificial Intelligence: Computational modeling with Machine Learning
As for the former, the availability of large-scale data resources as well as usable computational representations (in particular distributed representations of the sort used in deep learning) allows us to address linguistic and psycholinguistic questions related to reference using Data Science techniques. The primary questions here are: 1) what kinds of regularities and variation do we find in different referring expressions for the same object? and 2) how do object properties, on the one hand, and contextual information, on the other, affect the choice of referring expression?
As for the latter, research in Computational Linguistics and Language and Vision has made quite a bit of progress, in terms of both data and modeling, in addressing referring expression generation and interpretation; however, there is still a long way to go for models to truly mimic human behavior. The goal of this part of the thesis will be to improve computational models of tasks related to reference, incorporating insights from the analysis mentioned above.
The emphasis can be placed more on one methodology or the other, depending on the interests and experience of the successful candidate.
Part of the work can be carried out on a dataset developed within the AMORE project:
Silberer, C., S. Zarrieß, G. Boleda. 2020. Object Naming in Language and Vision: A Survey and a New Dataset. In Proceedings of LREC 2020, to appear. (Non-final version: https://gboleda.github.io/pubs/lrec2020naming-submitted.pdf)
The thesis will be carried out in the COLT research group (www.upf.edu/web/colt). COLT is a young, dynamic, cohesive group currently consisting of 12 senior, post-doc, and PhD researchers whose interests are related to the thesis topic. Its premises are in the Communication Campus of UPF (www.upf.edu/campus/en/comunicacio), with a lively ecosystem of researchers working on Linguistics, Computer Science, and Cognitive Science, and specifically on Computational Linguistics / Natural Language Processing.
Universitat Pompeu Fabra is a small, research-oriented, highly international institution (www.upf.edu), consistently ranked top in research among Spanish universities and placed 15th worldwide in the Times Higher Education ranking "150 under 50".
Barcelona is a unique city, with a Mediterranean and cosmopolitan culture, and very livable (www.upf.edu/barcelona/en).
Applicants should submit via email to Prof. Gemma Boleda (gemma.boleda AT upf.edu) a single PDF file with:
- a CV (max. 2 pages), including the names and e-mail addresses of two academic referees;
- a cover letter (max. 2 pages) explaining why you are interested in this position and how your profile fits the project.
We aim to build a diverse team; all applications are welcome, *especially those of female researchers* and members of other underrepresented groups. Informal inquiries are welcome (gemma.boleda AT upf.edu).
Application deadline: March 18 2020
Starting date: October 1 2020 (an earlier starting date may be possible)
--
Gemma Boleda
Universitat Pompeu Fabra / ICREA
https://gboleda.github.io