[Corpora-List] Corpus with tables of contents

Tristan Miller miller at ukp.informatik.tu-darmstadt.de
Mon May 2 10:26:41 CEST 2016


On 30/04/16 08:36 AM, Alexander Osherenko wrote:
> I am looking for a corpus with tables of contents of scientific books
> (no matter, in what specific science).

I'm not aware of any annotated corpus (i.e., a data set where the tables of contents are all in the same, easily machine-readable format). However, you might be able to build your own annotated corpus using the many hundreds (thousands?) of scientific books that are freely available online. You could use the following catalogues as starting points:





Also, if by "scientific books" you mean to include conference proceedings, you should check out the many scientific conferences that have open-access proceedings. This includes the many decades of CL/NLP conference proceedings which are handily compiled in the ACL Anthology: http://aclweb.org/anthology/

Regards,
Tristan

--
Tristan Miller, Research Scientist
Ubiquitous Knowledge Processing Lab (UKP-TUDA)
Department of Computer Science, Technische Universität Darmstadt
Tel: +49 6151 162 5296 | Web: https://www.ukp.tu-darmstadt.de/

