[Corpora-List] Word frequencies from a corpus of movie and TV show subtitles.

Mark Davies Mark_Davies
Tue Apr 14 21:26:57 CEST 2009



> where you can download the manuscript and the new frequency norms for
> (American) English based on subtitles that
> are much better than the Kucera & Francis norms and certainly for short words, than
> ** any other norm currently available. **

? ? ?

Are they based on a balanced corpus, or just subtitles from TV shows?

For frequency information from a large, balanced corpus (spoken, fiction, popular magazines, newspapers, academic), you might try:

http://www.americancorpus.org

============================================
Mark Davies
Professor of (Corpus) Linguistics
Brigham Young University
(phone) 801-422-9168 / (fax) 801-422-0906

http://davies-linguistics.byu.edu

** Corpus design and use // Linguistic databases **
** Historical linguistics // Language variation **
** English, Spanish, and Portuguese **
============================================


