[Corpora-List] Tool/program to estimate percent of English in a given text file?

Tristan Purvis tristan.purvis at aun.edu.ng
Fri Nov 24 22:54:36 CET 2017


Hello,

Quick version: Are there any publicly available tools or program modules I could use to estimate the percent of English that is found in a given sample of bilingual/multilingual text?

I am working on a study that looks at instances of code-switching (to English words) for certain lexical items whose distribution and usage I'll be tracking. Alongside that, I want to record each speaker's overall tendency to mix in English. It's not a high priority as a formal variable, so if it's too time-consuming to pursue, I'll be inclined to drop it, but it seems there might be a ready-made tool in the language detection field that would incidentally serve my purposes. Can anyone point me to a tool or quick solution that can estimate the percent of English found in a given text sample?

(Note: I only have 50-60 speakers to apply this to, so I can feasibly run each one individually through a tool that can measure this. That is, I don't necessarily need a tool that can run in batches, though obviously that would be a nice added convenience.)
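
To make the target measure concrete, here is a minimal sketch of the calculation in Python. It assumes a simple token-level definition (English tokens divided by all word tokens) and a hypothetical English word list. The names percent_english and ENGLISH and the placeholder sample are illustrative only and are not taken from any existing tool.

```python
import re

def percent_english(text, english_words):
    """Return the percentage of word tokens that appear in english_words."""
    # Lowercase, then keep runs of letters (optionally with one internal apostrophe).
    tokens = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if tok in english_words)
    return 100.0 * hits / len(tokens)

# Assumption: a toy word list; a real run would load a full English lexicon.
ENGLISH = {"i", "will", "call", "you", "tomorrow", "the", "meeting"}

# Placeholder sample: "zzz" and "qqq" stand in for non-English tokens.
sample = "zzz call you qqq after the meeting"
print(round(percent_english(sample, ENGLISH), 1))
```

A plain lookup like this will miscount words spelled the same in both languages, proper names, and English words missing from the list, so the result is only a rough per-speaker estimate.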

Thanks in advance,
Tristan

==========================
Mohamed Tristan Purvis, PhD
Assistant Professor, School of Arts & Sciences
American University of Nigeria
https://sites.google.com/site/tristanpurvis/curriculum-vitae