[Corpora-List] Finding representative terms

Chris Jordan cjordan at cs.dal.ca
Mon Dec 26 18:48:01 CET 2005


Unfortunately, there is no such thing as an ideal term discrimination
function. However, I would recommend trying something like relative
entropy. It is what I used in my thesis work on automatically
manufacturing queries. Cai et al. also used relative entropy and other
divergence functions for query expansion.

@inproceedings{Cai_query_expansion,
author = {D. Cai and C. J. van Rijsbergen and J. M. Jose},
title = {Automatic query expansion based on divergence},
booktitle = {CIKM '01: Proceedings of the Tenth International Conference on Information and Knowledge Management},
year = {2001},
isbn = {1-58113-436-3},
pages = {419--426},
location = {Atlanta, Georgia, USA},
doi = {http://doi.acm.org/10.1145/502585.502656},
publisher = {ACM Press},
}
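
As a rough illustration of the relative entropy suggestion, the sketch below scores each term in a document by its contribution to the Kullback-Leibler divergence between the document's term distribution and a background distribution. It is only one plausible reading of the idea, not necessarily the formulation used in the thesis work or by Cai et al. The function names, the whitespace tokenisation, and the additive smoothing constant are all assumptions made for the example. Note that a background collection is still needed as the reference distribution, even when only one document is being analysed.

```python
import math
from collections import Counter


def kl_term_scores(doc_tokens, background_tokens, smoothing=0.5):
    """Score each document term by its contribution to KL(doc || background).

    Hypothetical helper: the per-term score is p_doc(t) * log(p_doc(t) / p_bg(t)).
    """
    doc_counts = Counter(doc_tokens)
    bg_counts = Counter(background_tokens)
    vocab = set(doc_counts) | set(bg_counts)
    doc_total = sum(doc_counts.values())
    # Additive smoothing (an assumption of this sketch) keeps the background
    # probability non-zero for terms that occur only in the document.
    bg_total = sum(bg_counts.values()) + smoothing * len(vocab)

    scores = {}
    for term, count in doc_counts.items():
        p_doc = count / doc_total
        p_bg = (bg_counts[term] + smoothing) / bg_total
        scores[term] = p_doc * math.log(p_doc / p_bg)
    return scores


def top_terms(doc_tokens, background_tokens, k=5):
    """Return the k terms with the highest per-term divergence score."""
    scores = kl_term_scores(doc_tokens, background_tokens)
    return sorted(scores, key=scores.get, reverse=True)[:k]


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    doc = "the entropy of the query the entropy".split()
    background = "the cat sat on the mat and the dog saw the cat of course".split()
    print(top_terms(doc, background, k=3))
```

Terms that are frequent in the document but rare in the background receive the highest scores, while common function words are pulled down by their high background probability. With very small samples the ranking is sensitive to the smoothing constant, so the value used here should be treated as a placeholder.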



Delip Rao wrote:


>Hi,
>
>Is there any work that tries to find the most
>important/representative words from a document? I have
>tried using IDF but results were very poor. Also IDF
>does not make sense if we have a single document and
>want to get the most important term(s) out of it.
>
>Thanks!
>Delip






