[Corpora-List] Gender dataset

John D Burger john at mitre.org
Fri Apr 13 15:45:12 CEST 2012

kiran wrote:

> Is there any gender dataset available?
> It should ideally be a first name-gender mapping
> Ex: Abraham-Male or Abraham_Lincoln-Male

There are the name lists from the 1990 US Census, which I believe have been used in a lot of language research:


These comprise three files: male given names, female given names, and surnames, each with frequency information. From the first two files, you could construct a gender distribution for each given name.
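
As a rough sketch of that last step, something like the following Python would do. It assumes each line of the two given-name files is whitespace-separated, with the name in the first column and a percentage frequency in the second; check this against the actual files. The function names and the sample lines are made up for illustration and are not real Census figures. Note that if the frequencies are percentages within each sex, adding them across the two files treats the male and female populations as about the same size, which is only an approximation.

```python
def read_name_frequencies(lines):
    """Parse lines of the assumed form 'NAME FREQ ...' into {name: freq}."""
    freqs = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        # Normalise case so 'ABRAHAM' and 'Abraham' are the same key.
        freqs[fields[0].capitalize()] = float(fields[1])
    return freqs


def gender_distribution(male, female):
    """Combine the two frequency tables into P(gender | given name)."""
    dist = {}
    for name in set(male) | set(female):
        m = male.get(name, 0.0)
        f = female.get(name, 0.0)
        total = m + f
        if total > 0:
            dist[name] = {"male": m / total, "female": f / total}
    return dist


# Invented sample lines in the assumed format (NOT real Census values).
male_lines = ["ABRAHAM 0.5", "JORDAN 0.75"]
female_lines = ["MARY 2.0", "JORDAN 0.25"]

dist = gender_distribution(
    read_name_frequencies(male_lines),
    read_name_frequencies(female_lines),
)
for name in sorted(dist):
    print(name, dist[name])
```

With the real files you would pass an open file object instead of a list, e.g. read_name_frequencies(open(path)), where path is wherever you saved the male or female list. To get a plain name-to-gender mapping of the kind asked about, pick the more probable gender for each name, perhaps leaving out names whose distribution is close to even.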

- John Burger

