[Corpora-List] post-doc in machine learning for semantic composition at the university of trento

Marco Baroni marco.baroni at unitn.it
Mon Jan 23 17:51:51 CET 2012


The CIMeC-CLIC laboratory of the University of Trento, an interdisciplinary group of researchers studying language and conceptualization using both computational and cognitive methods (clic.cimec.unitn.it), announces the availability of a 2-year Post-Doc position in machine learning, renewable up to a maximum of 4 years.

The scholarship is funded by a 5-year European Research Council Starting Grant awarded to the COMPOSES (COMPositional Operations in SEmantic SPACE) project (clic.cimec.unitn.it/composes), which aims to model the meaning of phrases and sentences with computational methods.

* Research Goals and Desired Profile *

Distributional semantics is a general framework for inducing vector-based meaning representations of words, on a large scale, from collections of naturally occurring text (corpora). The successful candidate will develop, in collaboration with the COMPOSES project team, novel machine learning techniques to derive distributional semantic representations of phrases and sentences from distributional representations of words and other corpus data (e.g., deriving "red dog" from corpus-based representations of "red" and "dog"). To achieve this goal, we face the hard challenge of learning output representations that are very high-dimensional vectors from inputs that are also high-dimensional vectors, and that might in turn be the output of other empirically learned functions.
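As a purely illustrative sketch of the kind of learning problem described above (not the COMPOSES method), one simple baseline is to treat composition as multivariate ridge regression: the concatenated vectors of the two words are the input, and the corpus-observed phrase vector is the output. Everything below is an assumption made for illustration: the function names, the toy dimensionality, and the randomly generated synthetic vectors that stand in for real corpus-derived representations.

```python
import numpy as np


def fit_ridge_composition(X, Y, lam=1.0):
    """Closed-form multivariate ridge regression.

    X: (n_pairs, 2 * dim) concatenated word vectors (e.g. adjective + noun).
    Y: (n_pairs, dim) observed phrase vectors.
    Returns W of shape (2 * dim, dim) minimizing ||XW - Y||^2 + lam * ||W||^2.
    """
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)


def compose(W, u, v):
    """Predict a phrase vector from two word vectors (hypothetical helper)."""
    return np.concatenate([u, v]) @ W


# Synthetic toy data standing in for corpus-derived vectors (assumption).
rng = np.random.default_rng(0)
dim, n_pairs = 20, 500
adj = rng.normal(size=(n_pairs, dim))      # stand-ins for vectors like "red"
noun = rng.normal(size=(n_pairs, dim))     # stand-ins for vectors like "dog"
true_W = rng.normal(size=(2 * dim, dim))   # hidden map used to generate targets

X = np.hstack([adj, noun])
Y = X @ true_W + 0.01 * rng.normal(size=(n_pairs, dim))  # noisy phrase vectors

W = fit_ridge_composition(X, Y, lam=1e-3)
phrase = compose(W, adj[0], noun[0])       # predicted vector for the first pair
```

Real distributional vectors have thousands of dimensions or more, so the weight matrix grows quadratically with the dimensionality; this is where the regularization, dimensionality reduction and scaling issues listed in the desired profile come in.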

The successful candidate should have experience in one or more of the following areas: regularization methods, hierarchical regression, dimensionality reduction and/or feature selection for multidimensional multiple regression learning, scaling machine learning to large multivariate and multi-level problems, dealing with very sparse data, efficient large-scale implementation of regression methods, learning algorithms for deep architectures. The research fellow must also have a strong interest in working in an interdisciplinary environment.

* The Research Environment *

The CLIC lab (clic.cimec.unitn.it) is a unit of the University of Trento's Center for Mind/Brain Sciences (CIMeC, www.unitn.it/en/cimec), an English-speaking, interdisciplinary center for research on brain and cognition whose staff includes neuroscientists, psychologists, (computational) linguists, computer scientists and physicists.

CLIC consists of researchers from the Departments of Computer Science (DISI) and Cognitive Science (DISCoF) carrying out research on a range of topics including concept acquisition, corpus-based computational semantics, combining NLP and computer vision, combining brain and corpus data to study cognition, formal semantics and theoretical linguistics. Modeling composition in distributional semantics is increasingly a focal point of CLIC, and activity in this area will grow considerably thanks to COMPOSES funds.

CLIC is part of the larger network of research labs focusing on Natural Language Processing and related domains in the Trento region, which is quickly becoming one of the areas with the highest concentration of researchers in NLP and related fields anywhere in Europe.

The CLIC/CIMeC laboratories are located in beautiful Rovereto, a lively town in the middle of the Alps, famous for its contemporary art museum, the quality of its wine, and the range of outdoor sports and relaxation opportunities it offers.


* Application Information *

For further information, please send an expression of interest to marco.baroni at unitn.it, attaching a CV. The position is available immediately and open until filled.
