[Corpora-List] Chinese Name Gender Recognition

Heng Ji hengji at cs.nyu.edu
Thu Dec 22 19:11:00 CET 2005



I believe your IR idea will boost performance. You may also want to try
applying pronoun reference resolution before gender disambiguation, since
written Chinese third-person pronouns are clearly distinguished by gender
(他 "he" vs. 她 "she"). If you can link a pronoun in the context to the
name candidate, that might help. In addition, a few gender-specific title
words in the context would be useful too.
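
To make the pronoun and title cues concrete, here is a minimal sketch in
Python. It is only a window heuristic, not real pronoun reference
resolution: it counts gendered cues within a few characters of each mention
of the name and takes a majority vote. The cue lists, the window size, and
the function name `context_gender` are illustrative assumptions, not part
of any existing system.

```python
import re

# Illustrative cue lists (assumptions). "其他" ("other") and the plural
# forms "他们"/"她们" are excluded because they say nothing about one person.
MALE_PATTERNS = [re.compile(r"(?<!其)他(?!们)"), re.compile("先生")]
FEMALE_PATTERNS = [re.compile(r"她(?!们)"), re.compile("女士|小姐|太太|夫人")]


def context_gender(name, text, window=15):
    """Return "M", "F", or None from cue words near each mention of `name`."""
    male = female = 0
    start = text.find(name)
    while start != -1:
        end = start + len(name)
        # Look at the characters around the mention, leaving out the name
        # itself so that characters inside the name are never counted as cues.
        snippet = text[max(0, start - window):start] + " " + text[end:end + window]
        male += sum(len(p.findall(snippet)) for p in MALE_PATTERNS)
        female += sum(len(p.findall(snippet)) for p in FEMALE_PATTERNS)
        start = text.find(name, start + 1)
    if male > female:
        return "M"
    if female > male:
        return "F"
    return None  # no cues, or a tie: leave the decision to the name model


print(context_gender("张伟", "张伟是一名工程师，他住在北京。"))
print(context_gender("李娜", "李娜女士昨天到达，她很高兴。"))
```

A real system would replace the fixed window with an actual coreference
step, so that a pronoun is only counted when it refers to the name
candidate.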

I would guess that lexical information alone can accurately recognize name
genders for people born before 1980, but it might not be enough for names
given later: many recent names have been made intentionally
gender-neutral. :) So you may want to incorporate time-frame information
in your system.
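
One way the time frame could enter a Bayes-style name model is sketched
below in Python. It is a small naive Bayes classifier with add-one
smoothing in which every character of the given name is paired with a
birth-period bucket. The two-bucket split at 1980, the class name
`NameGenderNB`, and the tiny training list are assumptions made up for
illustration only; they are not real data or results.

```python
import math
from collections import Counter, defaultdict


def features(given_name, birth_year):
    """Pair each character with a birth-period bucket (assumed split at 1980)."""
    period = "pre1980" if birth_year < 1980 else "post1980"
    return [f"{ch}|{period}" for ch in given_name]


class NameGenderNB:
    """Naive Bayes over (character, period) features with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.label_counts = Counter()
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, samples):
        # samples: iterable of (given_name, birth_year, label)
        for given, year, label in samples:
            self.label_counts[label] += 1
            for f in features(given, year):
                self.feat_counts[label][f] += 1
                self.vocab.add(f)
        return self

    def log_scores(self, given, year):
        total = sum(self.label_counts.values())
        v = len(self.vocab) + 1  # one extra slot for unseen features
        scores = {}
        for label, n in self.label_counts.items():
            denom = sum(self.feat_counts[label].values()) + self.alpha * v
            score = math.log(n / total)  # class prior
            for f in features(given, year):
                score += math.log((self.feat_counts[label][f] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, given, year):
        scores = self.log_scores(given, year)
        return max(scores, key=scores.get)


# Hypothetical toy samples, invented purely to exercise the code.
TOY = [("伟", 1970, "M"), ("强", 1972, "M"), ("伟", 1965, "M"),
       ("芳", 1968, "F"), ("丽", 1975, "F"), ("芳", 1971, "F")]

model = NameGenderNB().fit(TOY)
print(model.predict("伟", 1969))
print(model.predict("芳", 1970))
```

Because each character is tied to its period, evidence from older names is
not carried over to newer ones. A softer design would add the plain
character as a second feature so the two periods can share statistics.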

Heng

On Thu, 22 Dec 2005, Jun Lang wrote:


> Hi Mark Lewellen,
>
> Thanks for your concern about this problem.
>
> Yes. After doing some baseline research, I found there are many
> related problems in gender recognition based on Chinese names. Using
> only the name may not achieve a better result. I am considering
> combining some other resources to disambiguate the gender. For example,
> I could use a search engine to look up gender-designating words to
> improve the final accuracy.
>
> What do you think about it?
>
> Thanks!
>
> Have a nice Christmas Eve and Christmas Day!
>
> Best wishes,
> Bill_Lang(Jun Lang): Ph.D Candidate
> Information Retrieval Laboratory
> Harbin Institute of Technology
> Mail: bill_lang at gmail.com
> Homepage: http://ir.hit.edu.cn/~bill_lang
>
> -----Original Message-----
> From: Mark Lewellen [mailto:lewellen at erols.com]
> Sent: Wednesday, December 21, 2005 11:49 PM
> To: 'Jun Lang'; 'Xiaofei Lu'
> Cc: corpora at uib.no
> Subject: RE: [Corpora-List] Chinese Name Gender Recognition
>
> Since Chinese given names are not limited to a set of
> lexical items that are prototypically 'names' (i.e. they
> can be just about any lexical item), Chinese given names,
> as you probably know, often give no clue about gender.
> There has been some discussion of 'traits' that are
> more feminine or masculine and would be reflected in names,
> but there remains a lot of ambiguity. I doubt there is any
> statistical method, algorithm, or even native speaker that
> can make up for that problem!
>
> Mark Lewellen
>
>> -----Original Message-----
>> From: owner-corpora at lists.uib.no
>> [mailto:owner-corpora at lists.uib.no] On Behalf Of Jun Lang
>> Sent: Tuesday, December 13, 2005 7:31 AM
>> To: 'Xiaofei Lu'
>> Cc: corpora at uib.no
>> Subject: [Corpora-List] Re: [Corpora-List] Chinese Name
>> Gender Recognition
>>
>> Yeah! There are many names which could be used for both male and
>> female. It is a difficult problem. So far I have done some simple
>> research on this topic. Recently, I have been trying to get more
>> data. Since the parameter space is very large, decision trees cannot
>> produce the final result quickly. I want to use a Bayes model again.
>>
>> Can you give me some ideas about it? Thanks a lot!
>>
>> Best wishes,
>> Jun Lang
>>
>> -----Original Message-----
>> From: Xiaofei Lu [mailto:xflu at ling.ohio-state.edu]
>> Sent: December 13, 2005 13:56
>> To: Jun Lang
>> Subject: Re: [Corpora-List] Chinese Name Gender Recognition
>>
>> Interesting. What is the baseline, and how do you establish it?
>> Many names can be either male or female, can't they?
>>
>> On Tue, 13 Dec 2005, Jun Lang wrote:
>>
>>> Hi all Corpora Members,
>>>
>>> I am now studying Chinese name gender recognition. The input is a
>>> Chinese name. The output is the corresponding gender. I used the
>>> decision tree method, but the final accuracy is only about 70%.
>>>
>>> Do you know any other method which can achieve higher accuracy? And
>>> has anybody done any similar research?
>>>
>>> Thanks a lot!
>>>
>>> Best wishes,
>>> Bill_Lang(Jun Lang): Ph.D Candidate
>>> Information Retrieval Laboratory
>>> Harbin Institute of Technology
>>> Mail: bill_lang at gmail.com
>>> Homepage: http://ir.hit.edu.cn/~bill_lang


