Let me invite you to contact the research community that organised the recent colloquium "Towards a Reference Corpus of Web Genres", held in conjunction with Corpus Linguistics 2007. See:
- Colloquium schedule: http://corpus.leeds.ac.uk/serge/webgenres/schedule.html
- Corpus Linguistics 2007 website: http://www.corpus.bham.ac.uk/conference2007
Please note that I retrieved this information from the archives of this very mailing list (corpora), by searching them at: http://listserv.linguistlist.org/cgi-bin/wa?S1=corpora
You may find other interesting information with that valuable tool.
I met Marina Santini (http://www.itri.brighton.ac.uk/~Marina.Santini/) in Besançon in 2006, where she presented the following paper at JADT 2006: http://www.cavi.univ-paris3.fr/lexicometrica/jadt/jadt2006/PDF/II-077.pdf
I warmly recommend that reference as an entry point.
Best regards,
Serge Heiden
On Thursday, November 15, 2007 12:02 AM [GMT+1=CET], Ana Rita Remígio <anaritaremigio at ua.pt> wrote:
> Hello,
>
> Does anyone know of papers (or any other references) on
> classifications of Internet textual genres (FAQs, advertisements,
> ...)? The goal is to classify different electronic documents taken
> from the Web used to build a corpus.
>
> Thank you in advance,
> Ana Rita
>
>
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
_____________________________________________________________
Serge Heiden, slh at ens-lsh.fr, https://weblex.ens-lsh.fr
ENS-LSH/CNRS - ICAR UMR5191, Institut de Linguistique Française
15, parvis René Descartes 69342 Lyon BP7000 Cedex, tel. +33(0)622003883