[Corpora-List] WebIsALOD - Large-scale Hypernymy Dataset Released

Heiko Paulheim heiko at informatik.uni-mannheim.de
Thu May 18 10:25:16 CEST 2017


Dear all,

The Data and Web Science Group at the University of Mannheim is happy to announce the first release of the WebIsA database [1] as a Linked Open Data endpoint. The dataset contains 11.7 million hypernymy, or subsumption ("is a"), relations collected from the Web (e.g., "iPhone 4 is a smartphone") using a set of Hearst-like patterns (see the paper [2] for details). We provide the data together with confidence scores, rich provenance information, and interlinks to DBpedia and YAGO. In total, the dataset contains more than 470M triples.
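
As a rough illustration of what a Hearst-like pattern does, here is a minimal Python sketch. The two regular expressions and the function name `extract_isa` are illustrative assumptions only; they are not the patterns used to build WebIsA, which are described in [2].

```python
import re

# Illustrative patterns only (assumptions, not the WebIsA pattern set).
# "<hyponym> is a/an <hypernym>"
IS_A = re.compile(r"^(?P<hypo>.+?) is an? (?P<hyper>[\w -]+?)[.;]?$")
# "<hypernym> such as <hyponym>, <hyponym> and <hyponym>"
SUCH_AS = re.compile(r"(?P<hyper>[\w-]+) such as (?P<hypos>.+?)[.;]?$")
# Separators inside an enumeration: commas, "and", "or".
LIST_SEP = re.compile(r",\s*(?:and\s+|or\s+)?|\s+(?:and|or)\s+")


def extract_isa(sentence):
    """Return (hyponym, hypernym) pairs found in a single sentence."""
    sentence = sentence.strip()
    pairs = []

    match = IS_A.match(sentence)
    if match:
        pairs.append((match.group("hypo"), match.group("hyper")))

    match = SUCH_AS.search(sentence)
    if match:
        for hypo in LIST_SEP.split(match.group("hypos")):
            if hypo:
                pairs.append((hypo, match.group("hyper")))

    return pairs


print(extract_isa("iPhone 4 is a smartphone"))
```

A real extraction pipeline would additionally need noun phrase detection, normalization of plural forms, and the kind of confidence scoring mentioned above; this sketch only matches surface strings.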

The dataset is available at [3] as a Linked Data endpoint, a SPARQL endpoint, and downloadable dumps.

All the best,
Sven Hertling
Heiko Paulheim

[1] http://webdatacommons.org/isadb
[2] Julian Seitner, Christian Bizer, Kai Eckert, Stefano Faralli, Robert Meusel, Heiko Paulheim and Simone Paolo Ponzetto: A Large Database of Hypernymy Relations Extracted from the Web. In: LREC 2016.
[3] http://webisa.webdatacommons.org/

--
Prof. Dr. Heiko Paulheim
Data and Web Science Group
University of Mannheim
Phone: +49 621 181 2652
B6, 26, Room B1.16
D-68159 Mannheim

Mail: heiko at informatik.uni-mannheim.de
Web: www.heikopaulheim.com


