Some low-frequency words are known to everyone (toolbar, screenshot, soulmate, uppercase, hoodie), whereas others are hardly known at all (scourage, thunk, whicker, or caudle). This is one reason why word frequency is only a proxy for word knowledge, a problem long recognized in the L2 literature. We have now collected direct estimates of the number of people who know various words (for a total of 61,500 lemmas), a variable we call word prevalence.
Possible uses of the new prevalence measure include:
- Estimates of word difficulty for vocabulary tests
- Estimates of word difficulty for word learning experiments and word processing experiments
- Estimates of text difficulty
- Estimates of the size of the lexicon at various stages
I am sure more uses will follow.
You can find the article (in press in Behavior Research Methods) and the word prevalence measure at http://crr.ugent.be/archives/2045.
Marc Brysbaert
Department of Experimental Psychology
Ghent University
Henri Dunantlaan 2
B-9000 Gent
Belgium
Tel. +32 9 264 94 25
Fax. +32 9 264 64 96
E-mail: marc.brysbaert at ugent.be
Website: http://crr.ugent.be/members/marc-brysbaert