[Corpora-List] Metrics for corpus "parseability"

Max Chevalier chevalie at irit.fr
Mon Feb 4 20:12:06 CET 2008

Dear All,

I am a new user of this list....

I wonder whether anyone knows of techniques for evaluating the content homogeneity of a corpus. That is to say, I would like to estimate the number of themes (few or many) covered by its documents....

Does anyone have any ideas?

Sincerely yours,


