[Corpora-List] Chinese Name Gender Recognition

Xiaofei Lu xflu at ling.ohio-state.edu
Thu Dec 22 20:16:01 CET 2005


Are you planning to look at context at all? The pronoun resolution idea
should definitely help. Looking at the context in which a personal
name appears may help a bit, too, e.g., in cases where one or more names
appear after phrases like "member(s) of the women's team".

Xiaofei
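
A minimal sketch of the context idea discussed in this thread, under stated assumptions: it is not real pronoun resolution, only a proximity heuristic that counts gendered pronouns and title words near each mention of a name. The cue lists, the function names, and the 20-character window are illustrative choices, not anything proposed by the posters.

```python
import re

# Assumed, non-exhaustive cue lists (illustrative only).
FEMALE_CUES = ("她", "女士", "小姐", "夫人", "太太")
MALE_CUES = ("他", "先生")


def context_gender_votes(text, name, window=20):
    """Count gender cues within `window` characters on either side of each mention of `name`."""
    votes = {"female": 0, "male": 0}
    for match in re.finditer(re.escape(name), text):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        # Join left and right context with a separator so that no cue
        # is accidentally formed across the removed name.
        context = text[start:match.start()] + "\n" + text[match.end():end]
        votes["female"] += sum(context.count(cue) for cue in FEMALE_CUES)
        votes["male"] += sum(context.count(cue) for cue in MALE_CUES)
    return votes


def guess_gender_from_context(text, name, window=20):
    """Return 'female', 'male', or 'unknown' by majority vote over nearby cues."""
    votes = context_gender_votes(text, name, window)
    if votes["female"] > votes["male"]:
        return "female"
    if votes["male"] > votes["female"]:
        return "male"
    return "unknown"
```

A heuristic like this will misfire on words that merely contain a cue character (for example 其他, "other") and on pronouns that refer to someone else nearby, which is why linking the pronoun to the name candidate, as suggested in the quoted message, would be the more careful approach.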


On Thu, 22 Dec 2005, Heng Ji wrote:


> I believe your IR idea will boost the performance. You may also want to
> try applying pronoun reference resolution before gender disambiguation,
> since Chinese personal pronouns are clearly distinguished by gender. If
> you can link a pronoun in the context to the name candidate, that
> might help. In addition, a few gender-specific title words in the context
> would be useful too.
>
> I would guess that lexical information alone can accurately recognize name
> gender for people born before 1980, but it might not be enough for names
> given later, many of which have intentionally been made
> gender-neutral. :) So you may want to incorporate time-frame
> information into your system.
>
> Heng
>
> On Thu, 22 Dec 2005, Jun Lang wrote:
>
>> Hi Mark Lewellen,
>> Thanks for your interest in this problem.
>> Yes. After some baseline research, I found many related problems in
>> recognizing gender from Chinese names. Using the name alone may not
>> achieve a better result. I am considering combining other resources
>> to disambiguate the gender. For example, I could query a search
>> engine for gender-designating words to improve the final accuracy.
>> What do you think about it?
>>
>> Thanks!
>>
>> I wish you a nice Christmas Eve and Christmas Day!
>>
>> Best wishes,
>> Bill_Lang(Jun Lang): Ph.D Candidate
>> Information Retrieval Laboratory
>> Harbin Institute of Technology
>> Mail: bill_lang at gmail.com
>> Homepage: http://ir.hit.edu.cn/~bill_lang
>>
>> -----Original Message-----
>> From: Mark Lewellen [mailto:lewellen at erols.com]
>> Sent: Wednesday, December 21, 2005 11:49 PM
>> To: 'Jun Lang'; 'Xiaofei Lu'
>> Cc: corpora at uib.no
>> Subject: RE: [Corpora-List] Chinese Name Gender Recognition
>>
>> Since Chinese given names are not limited to a set of
>> lexical items that are prototypically 'names' (i.e. they
>> can be just about any lexical item), Chinese given names,
>> as you probably know, often give no clue about gender.
>> There has been some discussion on 'traits' that are
>> more feminine or masculine and would be reflected in names,
>> but there remains a lot of ambiguity. I doubt there is any
>> statistical method, algorithm, or even native speaker that
>> can make up for that problem!
>>
>> Mark Lewellen
>>
>>> -----Original Message-----
>>> From: owner-corpora at lists.uib.no
>>> [mailto:owner-corpora at lists.uib.no] On Behalf Of Jun Lang
>>> Sent: Tuesday, December 13, 2005 7:31 AM
>>> To: 'Xiaofei Lu'
>>> Cc: corpora at uib.no
>>> Subject: [Corpora-List] Re: [Corpora-List] Chinese Name
>>> Gender Recognition
>>>
>>> Yeah! Many names can be used for both males and females. It is a
>>> difficult problem. So far I have done some simple research on this
>>> topic, and recently I have been trying to collect more data. Since the
>>> parameter space is huge, decision trees cannot produce the final
>>> result quickly. I want to try a Bayes model again.
>>>
>>> Can you give me some ideas about it? Thanks a lot!
>>>
>>> Best wishes,
>>> Jun Lang
>>>
>>> -----Original Message-----
>>> From: Xiaofei Lu [mailto:xflu at ling.ohio-state.edu]
>>> Sent: December 13, 2005 13:56
>>> To: Jun Lang
>>> Subject: Re: [Corpora-List] Chinese Name Gender Recognition
>>>
>>> Interesting. What is the baseline, and how do you establish it?
>>> Many names can be either male or female, can't they?
>>>
>>> On Tue, 13 Dec 2005, Jun Lang wrote:
>>>
>>>> Hi all Corpora members,
>>>>
>>>> I am currently studying Chinese name gender recognition. The input
>>>> is a Chinese name; the output is the corresponding gender. I used a
>>>> decision tree method, but the accuracy is only about 70%.
>>>>
>>>> Do you know of any other method that can achieve higher accuracy?
>>>> And has anybody done similar research?
>>>>
>>>> Thanks a lot!
>>>>
>>>> Best wishes,
>>>> Bill_Lang(Jun Lang): Ph.D Candidate
>>>> Information Retrieval Laboratory
>>>> Harbin Institute of Technology
>>>> Mail: bill_lang at gmail.com
>>>> Homepage: http://ir.hit.edu.cn/~bill_lang
>>>>
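
Jun Lang mentions wanting to try a Bayes model in place of decision trees. As a rough illustration only, a character-level naive Bayes classifier over given-name characters with add-one smoothing might look like the sketch below. The class name `NameGenderNB`, the decision to use given names without surnames, and the six toy training names are all hypothetical; they are not the poster's data or method, and a real system would need a large labeled name list.

```python
import math
from collections import Counter, defaultdict


class NameGenderNB:
    """Naive Bayes over the characters of a given name (illustrative sketch)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                        # additive smoothing constant
        self.char_counts = defaultdict(Counter)   # label -> character -> count
        self.label_counts = Counter()             # label -> number of names
        self.vocab = set()                        # all characters seen in training

    def fit(self, names, labels):
        for name, label in zip(names, labels):
            self.label_counts[label] += 1
            for ch in name:
                self.char_counts[label][ch] += 1
                self.vocab.add(ch)
        return self

    def log_scores(self, name):
        """Return log P(label) + sum of log P(char | label) for each label."""
        total = sum(self.label_counts.values())
        v = len(self.vocab) + 1  # one extra slot for unseen characters
        scores = {}
        for label, n in self.label_counts.items():
            score = math.log(n / total)
            denom = sum(self.char_counts[label].values()) + self.alpha * v
            for ch in name:
                score += math.log((self.char_counts[label][ch] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, name):
        scores = self.log_scores(name)
        return max(scores, key=scores.get)


# Hypothetical toy data: made-up given names, for illustration only.
toy_names = ["丽芳", "秀娟", "美玲", "建军", "国强", "志伟"]
toy_labels = ["female", "female", "female", "male", "male", "male"]
model = NameGenderNB().fit(toy_names, toy_labels)

print(model.predict("丽娟"))  # female
print(model.predict("伟强"))  # male
```

Because each character contributes an independent log-probability, training and prediction are single passes over the data, which sidesteps the large parameter space that slowed the decision trees. It does nothing for genuinely ambiguous names, which is where the context and time-frame suggestions in this thread come in.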



