[Corpora-List] 2 open positions for Arabic speakers to work on news analysis

Ralf Steinberger ralf.steinberger at jrc.it
Wed Feb 6 15:46:12 CET 2008

Apologies for multiple postings!

The European Commission's Joint Research Centre (JRC) in Ispra, Northern Italy, is looking for two native Arabic speakers with basic IT skills to adapt its public news aggregation and analysis web portals to Arabic. Applicants should be available for a minimum of three months, preferably longer.

One position is an internship; the other is a contractor position with an external IT service provider. Both people would work at the JRC offices.

Location: Ispra, on Lago Maggiore in Italy, 60 km west of Milan;

Host: European Commission - Joint Research Centre (JRC)

Starting date: April 2008 or later;

Duration: 3 to 12 months;

Position 1: traineeship / internship / stage / Praktikum / tirocinio;

Remuneration 1: 963 Euro per month + travel allowance;

Position 2: contractor;

Remuneration 2: ca. 100 Euro per working day, after taxes;

Working language: English;

Activity: Web Technology, Language Technology; many other subject areas

URL: http://langtech.jrc.it/, http://emm.jrc.it/overview.html, http://www.jrc.it/;

Deadline: To be filled as soon as possible.

Contact: Erik.Van-der-Goot at jrc.it

The JRC has developed and runs several public news aggregation and analysis web portals (see http://emm.jrc.it/overview.html) and provides a number of services to a wide range of international customers. Arabic is one of the 35 languages currently covered, but no Arabic user interface is provided yet, and the tools need further tuning for the language. Tasks include:

- Translate interface menus;

- Translate, optimise and test Boolean search expressions for text classification;

- Identify more Arabic language news sources;

- Help write the XSLT conversion programs that extract the news texts from the raw web pages;

- Provide linguistic resources for information extraction programs (persons, organisations, locations, quotations, relations, events)

Applicants must have the following qualifications:

- Required: Arabic native speaker competence;

- Required: good reading, writing and speaking knowledge of English;

- Required: Sensitivity to language, knowledge of regional differences;

- Required: Basic IT skills, XML;

- Beneficial: further IT skills, web technology, HTML, XSLT, Java, Perl, Oracle, etc.;

- Beneficial: knowledge of further natural languages;

The JRC's news aggregation and analysis applications add value to the world of the written media:

- Unbiased reporting by aggregating news from multiple sources in many countries;

- Transparency: users see the viewpoints of the others, even across languages;

- Live information: updated every ten minutes;

- Multilingual: between 19 and 35 languages are covered;

- Cross-lingual information access;

- Aggregation of information from multiple documents and from many languages.

For more information on traineeships, cost of living, location, etc., see http://langtech.jrc.it/WorkatJRC.html.

Ralf Steinberger (Ralf.Steinberger at jrc.it)

European Commission - Joint Research Centre (JRC)

IPSC - SeS - Language Technology

URL: Applications: http://emm.jrc.it/overview.html

URL: The science behind them: http://langtech.jrc.it/

JRC-Acquis Multilingual Parallel Corpus (Version 3)

* Freely available for research purposes.

* 22 languages: Bulgarian, Czech, Danish, German, Greek, English, Spanish, Estonian, Finnish, French, Hungarian, Italian, Lithuanian, Latvian, Maltese, Dutch, Polish, Portuguese, Romanian, Slovak, Slovene and Swedish.

* Altogether over 1 billion words.

* Sentence alignment for 231 language pairs.

* For more information and download, see http://langtech.jrc.it/JRC-Acquis.html.

DGT-Translation Memory

* Freely available for research purposes.

* Aligned translation units for 231 language pairs.

* Alignment manually verified.

* For more information and download, see http://langtech.jrc.it/DGT-TM.html.

The JRC's Language Technology group specialises in the development of highly multilingual text analysis tools and in cross-lingual applications. Many applications are accessible online, e.g.:

* NewsExplorer (http://press.jrc.it/NewsExplorer/): multilingual news aggregation and analysis (19 languages); lets users navigate the news over time and across languages; trend analysis; collects information about people from the news; social network detection.

* NewsBrief (http://press.jrc.it/): breaking news detection and display of the very latest thematic news from around the world; email alerting (22+ languages).

* MedISys Medical Information System (http://medusa.jrc.it/): latest health-related news from around the world according to themes and diseases (22+ languages).

* EMM-Labs (http://emm-labs.jrc.it:8080/): latest developments; social networks; live people-in-the-news; country and theme fact sheets; maps showing violent events world-wide.
