[Corpora-List] Multilingual text analysis - job opening at the EC's Joint Research Centre in 2005

Ralf Steinberger RatzS at web.de
Thu Dec 23 08:31:00 CET 2004

At the European Commission’s Joint Research Centre (JRC) in Ispra, northern Italy, we expect one or two three-year research and development positions in the wider field of multilingual text analysis to open in the course of 2005. We are obliged to choose our candidates from a database called ELSA and cannot accept applicants who are not in it, so we strongly encourage interested persons to register. Registering with ELSA is fairly easy if you have an up-to-date CV at hand (see http://www.cordis.lu/research_openings/personnel_elsa_en.htm). Please choose ‘Computational Linguistics’ as one of the discipline keywords describing your background, so that we can find your application in this large database, which serves all research parts of the European Commission.

The JRC’s Language Technology work

The JRC’s Language Technology group specialises in multilingual text analysis applications providing cross-lingual information access and allowing users in the European Commission and in EU Member State institutions to explore and navigate large multilingual document collections. See http://www.jrc.it/langtech for details about our work, and http://www.jrc.it/langtech/WorkatJRC.html for information on contract types, internships, our location, etc.


To provide services (such as automatic news digest and analysis) to EC users from all 25 EU countries, we are, in principle, interested in working with all twenty official EU languages: the languages of the EU-15 (English, French, German, Spanish, Italian, Portuguese, Dutch, Danish, Greek, Finnish and Swedish) and those of the ten new EU Member States (Latvian, Lithuanian, Estonian, Czech, Hungarian, Slovene, Maltese, Polish and Slovak). We are also interested in the languages of the EU Accession Countries (Bulgarian, Romanian, Croatian, Turkish) and in a selection of world languages (including Arabic and Russian).

Disciplines and methods

Because of the large number of languages of interest, we mainly use statistical and machine learning techniques, and we try to exploit existing multilingual thesauri and nomenclatures. However, given the rising interest in information extraction and event template filling, we intend to include more linguistic, rule-based techniques for a subset of languages in the near future.

Fields of interest

Our fields of activity and interest include document retrieval, information extraction, named entity recognition, event template filling, terminology extraction, thesaurus indexing, multilingual classification and clustering, document relevance-ranking, monolingual and cross-lingual document similarity calculation, news analysis, topic detection, topic tracking, visualisation of textual information, the exploitation of parallel corpora (some of which are available in 20 languages), multilingual dictionary generation, etc.

Profile of the applicants

We are looking for computational linguists or people with a background in machine learning, linguistics, statistics, computer science, or related areas. Applicants should have good programming skills and an interest in producing hands-on results. People with either developer or researcher profiles are welcome to apply. Applicants must be nationals of one of the 25 European Union Member States.

