[Corpora-List] Release of BabelNet 4.0

Roberto Navigli navigli at di.uniroma1.it
Wed Feb 28 08:39:51 CET 2018





We are proud to announce the release of a new major version of BabelNet <http://babelnet.org> and its API, developed jointly by the Linguistic Computing Laboratory <http://lcl.uniroma1.it> of the Sapienza University of Rome, under the supervision of Prof. Roberto Navigli <http://wwwusers.di.uniroma1.it/~navigli/>, and Babelscape <http://babelscape.com>, a Sapienza startup company providing innovative solutions for multilingual NLP.

BabelNet -- winner of the 2017 Prominent Paper Award from the Artificial Intelligence Journal and of the META prize 2015, and covered in media such as The Guardian <https://www.theguardian.com/news/2018/feb/23/oxford-english-dictionary-can-worlds-biggest-dictionary-survive-internet> and Time magazine <http://wwwusers.di.uniroma1.it/~navigli/img/Redefining_the_modern_dictionary.png> -- is today’s most far-reaching multilingual resource. Depending on need, it can be used as an encyclopedic dictionary, a semantic network or a huge knowledge base. BabelNet was created by seamlessly interlinking and integrating the largest multilingual Web encyclopedia, i.e., Wikipedia, with the most popular computational lexicon of English, i.e., WordNet, and with other lexical-semantic resources such as Wiktionary, OmegaWiki, Wikidata, Wikipedia infoboxes, dozens of wordnets, Wikiquote, FrameNet, VerbNet, Microsoft Terminology, GeoNames, and ImageNet. BabelNet provides multilingual synsets, i.e., concepts and named entities lexicalized in many languages and connected by a large number of semantic relations.

Version 4.0 comes with the following features:


284 languages now covered


Wikipedia, Wiktionary, Wikidata and OmegaWiki have been updated thanks to BabelNet live <http://live.babelnet.org>, a continuously growing resource with daily updates from all the sources that go to make it up


Better sense inventory thanks to the manual validation of thousands of



All existing wordnets updated


New wordnets integrated for Gaelic, Portuguese and Korean


Improved treatment of Chinese


2 million new multilingual synsets (from 14 million in v3.7 to 16 million in v4.0)


832 million Babel senses (up from 745 million in v3.7), considerably increasing language coverage


Improved management of open wordnets that are now stored with their individual licenses


Improved versions of the Java and HTTP RESTful APIs (http://babelnet.org/download). The Java API comes with re-engineered interfaces and classes, additional methods for Java 8, Java 9-ready packaging, and support for the latest version of Lucene. Universal POS tags are now adopted, paving the way for synsets for closed-class words. A brand-new Python API with the same interface as the Java API is under development.

More statistics are available at http://babelnet.org/stats <http://babelnet.org/stats.jsp>.

We are organizing a two-day summer school <http://live.babelnet.org/search?word=summer+school&lang=EN> and hackathon <http://live.babelnet.org/synset?word=hackathon&lang=EN&details=1&orig=hackathon>, with tutorials, interactive sessions and presentations aimed at computational linguists, computer scientists, linguists and, more generally, BabelNet fans. We are gathering interest and preferences: if you are potentially interested, just fill in the form <https://goo.gl/forms/vbHXLmiwQ6RQtR433>! The event will be held in either Rome or Venice (to be decided: cast your vote!).

We are looking for native speakers of German, Dutch and Chinese for linguistic annotation tasks. If you are interested, just contact us!

Kind regards,

The BabelNet team

--
=====================================
Roberto Navigli
Dipartimento di Informatica
Sapienza University of Rome
Viale Regina Elena 295b (building G, second floor)
00161 Roma Italy
Phone: +39 0649255161 - Fax: +39 06 49918301
Home Page: http://wwwusers.di.uniroma1.it/~navigli
=====================================
