[Corpora-List] Getting articles from newspapers to compile a corpus

Matías Guzmán mortem.dei at gmail.com
Thu Nov 29 19:21:11 CET 2012

Hi all,

I was wondering if anyone knows how to collect every available article from online newspapers and magazines. I was thinking of something like giving a program the URL of the newspaper (e.g. www.eltiempo.com) and getting the text from all the pages on that site. Is that possible?
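
To make the idea concrete, here is a minimal sketch of such a program: a breadth-first crawler that starts from one URL, follows only links on the same host, and keeps the visible text of each page. It uses only the Python standard library. The names PageParser, fetch, crawl, and max_pages are made up for this illustration, and the text extraction is deliberately naive (it keeps menus and footers along with the article body), so treat it as a starting point under those assumptions rather than a finished tool.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse
from urllib.request import urlopen


class PageParser(HTMLParser):
    """Collects visible text and href targets from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not script or style content.
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def fetch(url):
    """Download one page over the network (illustrative default fetcher)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch_page=fetch, max_pages=50):
    """Breadth-first crawl restricted to the host of start_url.

    Returns a dict mapping each visited URL to its extracted text.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    texts = {}
    while queue and len(texts) < max_pages:
        url = queue.popleft()
        try:
            html = fetch_page(url)
        except Exception:
            continue  # skip pages that fail to download
        parser = PageParser()
        parser.feed(html)
        texts[url] = " ".join(parser.text_parts)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return texts
```

Calling crawl("http://www.eltiempo.com/") would then return the text of up to max_pages pages reachable from the front page. A real crawl of a newspaper should also check the site's robots.txt, pause between requests, and strip navigation boilerplate before the text goes into a corpus.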

Thanks a lot,

Matías
