[Corpora-List] Difference in POS tag distribution in different genres

Angus Grieve-Smith grvsmth at panix.com
Mon Dec 17 05:01:28 CET 2012

On 12/16/2012 10:24 PM, Adam Kilgarriff wrote:
> Mark Davies and Andrew Hardie have already mentioned Doug Biber's
> work, I'll just add what I think of as the key/original reference, his
> "Variation across Speech and Writing", CUP 1988.

Yes. Biber's idea was brilliant, but as I wrote a few years ago, it's very difficult to combine these measurements in a factor analysis, because there is so much potential for grammatically motivated covariation.


Ultimately, variation is about choice, conscious or unconscious. If a newspaper writer or editor is choosing to use more proper nouns (for example) per thousand words, then they're choosing not to refer to that person, place or thing with a personal pronoun, a common noun, or a demonstrative or possessive pronoun. Or maybe they're choosing to refer to that person, place or thing explicitly instead of implicitly.
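
To make that trade-off concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the two ten-token "texts" are invented, the tags are Penn Treebank style (NNP for proper noun, PRP for personal pronoun), and rates_per_thousand is a hypothetical helper, not part of Biber's method or of any library. It only shows that when the same referents are expressed with pronouns instead of proper nouns, one rate falls exactly as the other rises, so the two measurements are not independent.

```python
from collections import Counter

def rates_per_thousand(tags):
    """Return each tag's frequency normalized to a rate per 1,000 tokens."""
    total = len(tags)
    counts = Counter(tags)
    return {tag: 1000 * n / total for tag, n in counts.items()}

# Two invented 10-token tag sequences (illustrative only, not real corpus data).
# They differ in exactly three positions: the second text uses a personal
# pronoun (PRP) where the first uses a proper noun (NNP).
news = ["NNP", "VBD", "NNP", "IN", "NNP", "VBD", "DT", "NN", "IN", "NNP"]
chat = ["PRP", "VBD", "PRP", "IN", "NNP", "VBD", "DT", "NN", "IN", "PRP"]

news_rates = rates_per_thousand(news)
chat_rates = rates_per_thousand(chat)

# The drop in the proper-noun rate equals the rise in the pronoun rate.
nnp_drop = news_rates["NNP"] - chat_rates["NNP"]
prp_rise = chat_rates["PRP"] - news_rates.get("PRP", 0.0)
print(nnp_drop, prp_rise)
```

Feed rates like these into a factor analysis as if they were separate variables, and part of what comes out is this built-in trade-off rather than anything about genre.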

If those choices covary with genre, it's because of the norms of that genre and the purpose and situational limitations (medium, cognitive, temporal, etc.) of the production of each text. Unfortunately Biber's method tends to obscure these choices and connections, but I still hope that it can be the foundation for something more enlightening.


-Angus B. Grieve-Smith

grvsmth at panix.com
