[Corpora-List] Keywords Generator (fwd)

Trevor Jenkins trevor.jenkins at suneidesis.com
Mon Feb 18 15:44:38 CET 2008


On Mon, 18 Feb 2008, True Friend <true.friend2004 at gmail.com> asked for help:

AntConc has a word frequency count feature. Why not use that?

Ben Allison has given you a UNIX solution. Here's mine

tr "[:space:]" "\n" <Sense\ and\ Sensibility.txt|tr "[:upper:]" "[:lower:]"|tr -d "[:punct:]"|sort|uniq -c|sort -n > SS-list

Change "Sense\ and\ Sensibility.txt" and "SS-list" to whatever your own files are called. You can tell what I've been playing with recently. ;-)

The difference between mine and Ben's is that mine relies solely upon standard filters that should be available on every UNIX machine. You might not have Perl installed, which is required by Ben's version. Of course, you might not have the GNU version of textutils, which I'm relying upon. We're both sorting on ascending frequency.
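
Neither pipeline prints the two corpora side by side, which is what the question asks for. What follows is only a rough sketch of one way to get there with the same standard filters plus awk. The function names (wordfreq, keyfreq) and the file names in the usage line are made up for illustration, and it assumes the keyword list holds one word per line.

```sh
# Hypothetical helper: count every word in file $1.
# Output is "count word" per line, as produced by uniq -c.
wordfreq() {
    tr -s "[:space:]" "\n" < "$1" |
        tr "[:upper:]" "[:lower:]" |
        tr -d "[:punct:]" |
        sort | uniq -c
}

# Hypothetical helper: for each keyword in $1 (one per line), print
# keyword <TAB> count in corpus $2 <TAB> count in corpus $3.
keyfreq() {
    {
        # Tag each stream so awk can tell the three inputs apart.
        wordfreq "$2" | sed 's/^/A /'
        wordfreq "$3" | sed 's/^/B /'
        # Keywords go last, after both count tables have been read.
        tr "[:upper:]" "[:lower:]" < "$1" | sed 's/^/K /'
    } | awk '
        $1 == "A" { a[$3] = $2 }
        $1 == "B" { b[$3] = $2 }
        # A keyword missing from a corpus is reported with a count of 0.
        $1 == "K" && NF == 2 { printf "%s\t%d\t%d\n", $2, a[$2], b[$2] }
    '
}

# Usage (file names are assumptions):
#   keyfreq keywords.txt general.txt law.txt > parallel-list
```

The keywords come out in the order of the keyword file. These are raw counts, so if the two corpora differ much in size you would want to normalise them (per thousand words, say) before comparing.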


> Hi Folks
> I need a program/script (even for *nix) that can provide the frequencies of a word list in two corpora. I made this list by comparing two word lists, one from general English (specifically of Pakistani origin) and one from law English (also of Pakistani origin). I now want to present these keywords with their frequencies in both corpora as proof that these words are more frequent in law. The keywords were generated by AntConc. Is there any script/tool that can generate a parallel list of the frequencies of each word in both corpora?
> Regards
> M Shakir Aziz
> A Corpus Linguistics Student
> Pakistan

-- محمد شاکر عزیز

Regards, Trevor

<>< Re: deemed!


