[Corpora-List] automatic definition construction/retrieval

Mark Sanderson m.sanderson at sheffield.ac.uk
Fri Sep 2 20:02:00 CEST 2005

Here's another one:

Joho, H. & Sanderson, M. (2000) Retrieving Descriptive Phrases from
Large Amounts of Free Text, in Proceedings of the 9th ACM Conference
on Information and Knowledge Management, pages 180-186.

You can download it from here


At 14:28 02/09/2005, Yannick Versley wrote:

> > I wonder if anybody could provide me with some advice related to tools
> > or references related to the:
> >
> > 1) automatic construction of dictionary-like definitions of terms
> > extracted automatically from free text articles, or alternatively
> > 2) the retrieval of sentences which are likely to describe definitions
> > of terms within documents would also be appreciated.

>I think this has much in common with the problem of answering definition
>questions in question answering. I would think that
>Hildebrandt/Katz/Lin (2003): Answering Definition Questions
> Using Multiple Knowledge Sources
>could be a good starting point.


>If you want to get something as in your example:
> > [HpaB] : HpaB is a protein which promotes the secretion of a large set
> > of effector proteins and prevents the delivery of non-effectors into the
> > plant cell.

>a good start could be to use a list of upper-level terms like "protein",
>"amino acid" etc. as well as action verbs and then scan the text for patterns
>
>HpaB, [NP], [NP] and other [Hypernym-NP]s
>as well as
>HpaB [VP [action verb] ...]

>For reference, see e.g.
>Hearst, M. (1992): Automatic Acquisition of Hyponyms from Large Text Corpora
>(see http://www.sims.berkeley.edu/~hearst/publications.html)
>
>R. Girju, A. Badulescu, and D. Moldovan (2000):
>Learning Semantic Constraints for the Automatic Discovery of Part-Whole
>
>R. Girju (2003):
>Automatic Detection of Causal Relations for Question Answering



>Kind regards,
>Yannick Versley
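
The pattern scan suggested in the quoted message (a hypernym list plus
action verbs, matched against "HpaB, [NP], [NP] and other [Hypernym-NP]s"
and "HpaB [VP [action verb] ...]") could be roughed out as below. This is
only a minimal sketch using plain regular expressions in place of real NP/VP
chunking. The word lists, the extra "X is a <hypernym>" copula pattern, and
the names build_patterns and find_definition_candidates are assumptions made
for illustration; none of them come from the papers cited in this thread.

```python
import re

# Assumed, illustrative word lists; a real system would need far larger ones.
HYPERNYMS = ["protein", "amino acid", "enzyme", "gene"]
ACTION_VERBS = ["promotes", "prevents", "inhibits", "binds", "regulates"]


def _alternation(words):
    """Build a regex alternation, longest entries first."""
    ordered = sorted(words, key=len, reverse=True)
    return "|".join(re.escape(w) for w in ordered)


def build_patterns(term):
    """Return (label, compiled regex) pairs for one target term."""
    t = re.escape(term)
    hyper = _alternation(HYPERNYMS)
    verbs = _alternation(ACTION_VERBS)
    gap = r"(?:\w+\s+){0,3}?"  # allow up to three modifier words
    return [
        # "HpaB is a protein ..." (copula followed by a hypernym)
        ("copula", re.compile(
            rf"\b{t}\s+is\s+(?:a|an|the)\s+{gap}(?:{hyper})\b")),
        # "HpaB, X, Y and other proteins" (Hearst-style enumeration)
        ("and-other", re.compile(
            rf"\b{t}\b(?:\s*,\s*[^,.;]+)*,?\s+(?:and|or)\s+other\s+"
            rf"{gap}(?:{hyper})s?\b")),
        # "HpaB promotes ..." (term directly followed by an action verb)
        ("action-verb", re.compile(rf"\b{t}\s+(?:{verbs})\b")),
    ]


def find_definition_candidates(text, term):
    """Return (label, sentence) for each sentence matching some pattern."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    patterns = build_patterns(term)
    hits = []
    for sentence in sentences:
        for label, pattern in patterns:
            if pattern.search(sentence):
                hits.append((label, sentence))
                break  # the first matching pattern labels the sentence
    return hits


if __name__ == "__main__":
    sample = (
        "HpaB is a protein which promotes the secretion of a large set "
        "of effector proteins. HpaB, HpaC and other proteins were studied."
    )
    for label, sentence in find_definition_candidates(sample, "HpaB"):
        print(label, "|", sentence)
```

Regular expressions keep the sketch self-contained, but they only
approximate the [NP] and [VP] slots; a chunker or parser would be needed to
match those constituents properly.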
