No subject


Wed Jun 27 16:29:31 CEST 2007


ra at uib.no
Cc :
Date : Sun, 18 Sep 2005 10:16:30 -0600
Subject : [Corpora-List] The genre of the Web








> I'm looking for publications or URLs that look at the genre of the web in quantitative terms.

>

> In other words, if one looks at the four major genres/registers SPOKEN, FICTION, NEWSPAPER, ACADEMIC, most would probably agree that the web is more like NEWSPAPER and ACADEMIC than it is SPOKEN or FICTION, although there are certainly bits and pieces of all of these genres/registers on the web.

>

> I imagine that something like the following has already been done, but it would seem that a person could look at the frequency of 50-60 words or phrases in the major genres/registers of the BNC, for example, and then compare this to the frequency of the same words and phrases on the Web. In quantitative terms, the web would be "most like" the register with the highest correlation coefficient.
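
The comparison proposed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (pearson, closest_register) are hypothetical, Pearson's r is assumed as the correlation coefficient (the message does not say which one is meant), frequencies are assumed to be already normalized (e.g. per million words), and the numbers are invented toy values, not BNC or web counts.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(vx * vy)


def closest_register(web_freqs, register_freqs):
    """Correlate web frequencies with each register's frequencies.

    web_freqs:      {word: normalized frequency on the web}
    register_freqs: {register name: {word: normalized frequency}}
    Returns (name of the register with the highest r, {register: r}).
    """
    words = sorted(web_freqs)  # fixed word order shared by all vectors
    web = [web_freqs[w] for w in words]
    scores = {
        name: pearson(web, [freqs[w] for w in words])
        for name, freqs in register_freqs.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Invented toy values for four placeholder words (NOT real corpus data).
toy_registers = {
    "SPOKEN":    {"w1": 90.0, "w2": 10.0, "w3": 20.0, "w4": 5.0},
    "NEWSPAPER": {"w1": 20.0, "w2": 60.0, "w3": 40.0, "w4": 30.0},
}
toy_web = {"w1": 25.0, "w2": 55.0, "w3": 45.0, "w4": 35.0}

best, scores = closest_register(toy_web, toy_registers)
print(best, scores)
```

With real data, the dictionaries would hold the 50-60 chosen words or phrases, and the result would only be as reliable as the web frequency estimates themselves.
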

>

> Three notes:

> 1) A BNC-based site like VIEW [http://view.byu.edu] allows users to quickly compare the frequency in different registers [use "Charts" on the VIEW site].

> 2) This assumes we can abstract away from the basic methodological problem of calculating frequencies from the web -- an issue that has been discussed in a number of threads here on CORPORA.

> 3) This is a very simplistic lexically-oriented comparison, with no attempt to look at syntactic features, etc.

>

> On the other hand, does it even make sense to try and relate the overall genre orientation of the web to one of these four or five discrete genres? Would it be better to simply refer to it as a mix of GENRE1 + GENRE2? Going even further, does it make sense to even try and relate the web to pre-defined genres, rather than perhaps just referring to it as its own "Web" register?

>

> Thanks in advance,

>

> Mark Davies

>

> =================================================

> Mark Davies

> Assoc. Prof., Linguistics

> Brigham Young University

> (phone) 801-422-9168 / (fax) 801-422-0906

> http://davies-linguistics.byu.edu

>

> ** Corpus design and use // Linguistic databases **

> ** Historical linguistics // Language variation **

> ** English, Spanish, and Portuguese **

> =================================================


>

>






