[Corpora-List] Comparing word lengths
Muhammad Shakir Aziz
true.friend2004 at gmail.com
Tue Jan 3 13:18:22 CET 2017
Dear Corpora Members,
I am working with online conversational texts that contain many shorthand
spellings. I have normalized these spellings, either to a longer standard
form (e.g. brother for bro) or to a shorter one (e.g. so for sooooooooo).
Since word length is an important variable in my analysis, I want to make
sure that there is no significant overall difference between the normalized
and non-normalized texts. My question: is it OK to simply compare the mean
word lengths of each corpus category? Or should I put the mean score of each
file in two columns (normalized versus non-normalized) and apply a
significance test?
PS: My guess is that at most about 10% of the words are affected by this
normalization process, but I want to make sure that its effect is negligible.
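
To make the two-column option concrete, here is a minimal sketch in Python of
what I have in mind. It assumes one mean word length per file, computed on
whitespace-separated tokens, and it uses a sign-flip permutation test on the
paired differences only as one possible choice of test (a paired t-test or a
Wilcoxon signed-rank test would be the usual alternatives). The function
names, the tokenization and the number of resamples are my own illustrative
assumptions, not an established procedure.

```python
import random
import statistics


def mean_word_length(text):
    """Mean length in characters of the whitespace-separated tokens in text."""
    words = text.split()
    return sum(len(w) for w in words) / len(words)


def paired_permutation_test(before, after, n_resamples=10000, seed=0):
    """Two-sided sign-flip permutation test on paired per-file means.

    before[i] and after[i] must come from the same file (non-normalized
    and normalized). Returns (observed mean difference, p-value).
    """
    diffs = [a - b for a, b in zip(after, before)]
    observed = statistics.mean(diffs)
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    extreme = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the sign of each difference is arbitrary.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(statistics.mean(flipped)) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value above zero.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value


if __name__ == "__main__":
    # Hypothetical toy files, for illustration only.
    raw_files = ["hey bro how r u", "that is so good bro"]
    norm_files = ["hey brother how are you", "that is so good brother"]
    raw_means = [mean_word_length(t) for t in raw_files]
    norm_means = [mean_word_length(t) for t in norm_files]
    diff, p = paired_permutation_test(raw_means, norm_means)
    print(f"mean difference = {diff:.3f}, p = {p:.3f}")
```

My understanding is that a non-significant result would not by itself show
that the difference is negligible, so I would also look at the size of the
mean difference returned by the function.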