[Corpora-List] "normalizing" frequencies for different-sized corpora

Jenny Eagleton jenny at asian-emphasis.com
Mon Sep 12 10:20:01 CEST 2005

Hello Corpora and Statistics Experts,

This is a very simple question for all the corpora/statistics experts
out there, but this novice is not really mathematically inclined. I
understand Biber's principle of "normalization"; however, I am not sure
how to calculate it. I want frequency counts normalized per 1,000 words
of text. I can see how to do it if the figures are even, i.e. if I have
a corpus of 4,000 words and a frequency of 200, I would have a
normalized figure of 50.

But how would I calculate it when the figures are not even? For
example, if I have 2,646 instances of a certain kind of noun in a
corpus of 55,166 words, how would I calculate the normalized figure per
1,000 words?
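
For reference, the arithmetic in question can be sketched as follows:
divide the raw count by the corpus size and multiply by 1,000. This is
only a minimal illustration; the function name per_thousand and the
base parameter are assumptions made for this example and do not come
from any particular corpus tool.

```python
def per_thousand(count, corpus_size, base=1000):
    """Return a raw frequency normalized per `base` words of text.

    `per_thousand` and `base` are hypothetical names used for illustration.
    """
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    # Multiply before dividing so that round cases stay exact.
    return count * base / corpus_size


# The even case from the message: 200 hits in 4,000 words.
print(per_thousand(200, 4000))               # 50.0

# The uneven case: 2,646 hits in 55,166 words.
print(round(per_thousand(2646, 55166), 2))   # 47.96
```

The same function covers both cases, so the uneven figures need no
special treatment: 2,646 / 55,166 x 1,000 comes to roughly 47.96 per
1,000 words.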


Research Assistant
Dept. of English & Communication
City University of Hong Kong
