[Corpora-List] Standard ontology for document classification?

Ralf Steinberger ralf.steinberger at jrc.it
Tue Oct 3 09:12:00 CEST 2006


The multilingual (over 20 languages), wide-coverage Eurovoc thesaurus with
its approximately 6000 classes has a subset of about 60 science-oriented
classes, plus many related terms and classes in other domains that may also
be useful (e.g. politics, law, economics, trade, finance, social questions,
education, employment, transport, environment, agriculture, energy,
geography). The science-oriented classes provide the major science domains,
but may not be detailed enough for your purposes. Please check out for

Eurovoc is browsable at http://europa.eu/eurovoc/ and is available free for
research purposes. For details on where to get Eurovoc, see

Eurovoc was developed for manual cataloguing of mainly parliamentary
documents, but collections of multi-label classified documents such as the
JRC-Acquis (http://langtech.jrc.it/JRC-Acquis.html) have been used to train
an automatic multi-label Eurovoc classification system.

I hope this helps. All the best,


Ralf Steinberger
European Commission - Joint Research Centre (JRC)
IPSC - SeS - Language Technology (http://langtech.jrc.it,
http://press.jrc.it/NewsExplorer/)
T.P. 267, Via Fermi 1
21020 Ispra (VA), Italy

-----Original Message-----
From: owner-corpora at lists.uib.no [mailto:owner-corpora at lists.uib.no] On
Behalf Of Xabier Saralegi Urizar
Sent: 02 October 2006 12:26
To: CORPORA at uib.no
Subject: [Corpora-List] Standard ontology for document classification?

Dear all,

I want to classify many scientific documents among different categories
based on their knowledge area, such as health, geography...

My question is whether there is a standard ontology for such a




Xabier Saralegi Urizar
Elhuyar I+G+B
Zelai Haundi kalea, 3
Osinalde industrialdea
20170 Usurbil
(+34) 943 36 30 40
xabiers at elhuyar.com / www.elhuyar.org
