[Corpora-List] Word frequencies from a corpus of movie and TV show subtitles.

Cyrus Shaoul cyrus.shaoul
Wed Apr 15 20:36:37 CEST 2009


I did not make the corpus, but it looks like they used OCR technology to get the subtitles off the video frames.

See:

http://expsy.ugent.be/subtlexus

for more info.

-Cyrus

M.E.Sciubba wrote:
> Do the subtitles refer to the 'whole' script, or are they the 35-character
> subtitles given in movies?
> e.

--
=[=]={=}=[=]={=}=[=]={=}=[=]={=}=[=]={=}
Cyrus Shaoul
http://www.psych.ualberta.ca/~westburylab/
University of Alberta
780-492-5843
=[=]={=}=[=]={=}=[=]={=}=[=]={=}=[=]={=}


