Search the corpus: http://www.tekstlab.uio.no/nowac/
Read about it here: http://www.hf.uio.no/tekstlab/nowac.html
It will be properly announced later.
Best, Janne Bondi Johannessen.
2010/4/8 Adriano Ferraresi <adriano at sslmit.unibo.it>
> Dear corpora members,
>
> we are happy to announce that we've recently completed work on frWaC, a new
> corpus resource for French.
>
> Like deWaC (for German), itWaC (for Italian) and ukWaC (for English), frWaC
> is a mega-corpus (~ 1.6 billion words) obtained by crawling and
> post-processing Web data. It is available both in a plain text version, and
> in an annotated version, which includes Part-of-Speech and lemma
> information. An earlier version of the corpus, and the procedure for its
> construction, are described here:
>
> Ferraresi, A., S. Bernardini, G. Picci and M. Baroni (2010) “Web Corpora
> for Bilingual Lexicography: A Pilot Study of English/French Collocation
> Extraction and Translation”. In Xiao, R. (ed.) Using Corpora in Contrastive
> and Translation Studies. Newcastle: Cambridge Scholars Publishing.
>
> For more details on the corpus and how to obtain it, please visit the WaCky
> project website:
>
> http://wacky.sslmit.unibo.it/
>
> Best,
>
> The WaCkies
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
>
--
Janne Bondi Johannessen
Professor, The Text Laboratory, ILN, http://www.hf.uio.no/tekstlab/
President, NEALT, http://omilia.uio.no/nealt/
University of Oslo
P.O.Box 1102 Blindern, N-0317 Oslo, Norway
Tel: +47 22 85 68 14, mob.: +47 928 966 34