[Corpora-List] software semantic similarity between texts

Antonio Toral antonio.toral at ilc.cnr.it
Mon Oct 20 16:56:57 CEST 2008

Thanks for the suggestion Scott.

From that website I tried the "Pairwise Comparison" tool, and it seems to do what I need: given some short texts, I can get the "similarity" between each pair.

For example, for these texts about "waterways" (glosses and definitions):

sentence1: "a navigable body of water"
sentence2: "a conduit through which water flows"
sentence3: "A waterway is any navigable body of water. These include rivers, lakes, oceans, and canals."

I get:

pairwise_comparison (sentence1, sentence2) = 0.78
pairwise_comparison (sentence1, sentence3) = 0.85
pairwise_comparison (sentence2, sentence3) = 0.82

However, on that website I can find only online demos, whereas I need software that I can download and integrate into a system. Do you know of any downloadable LSA package?
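
To make the request concrete, here is a minimal sketch in Python of the kind of interface meant by "pairwise comparison". Note that this is not LSA: it uses plain bag-of-words cosine similarity as a stand-in, so its scores will not match the ones from the demo above. The function names (tokenize, cosine, pairwise_comparison) are made up for illustration and do not come from any existing package. An LSA package would keep the same outer loop but compare the texts in a reduced semantic space instead of comparing raw word counts.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pairwise_comparison(texts):
    """Return {(name1, name2): similarity} for every unordered pair of texts.

    Assumption: word-count vectors stand in for the LSA representation.
    """
    vectors = {name: Counter(tokenize(text)) for name, text in texts.items()}
    return {
        (n1, n2): cosine(vectors[n1], vectors[n2])
        for n1, n2 in combinations(vectors, 2)
    }


glosses = {
    "sentence1": "a navigable body of water",
    "sentence2": "a conduit through which water flows",
    "sentence3": "A waterway is any navigable body of water. "
                 "These include rivers, lakes, oceans, and canals.",
}

scores = pairwise_comparison(glosses)
for (n1, n2), score in scores.items():
    print(f"pairwise_comparison ({n1}, {n2}) = {score:.2f}")
```

Because this stand-in only counts shared words, it rates (sentence1, sentence3) as the closest pair, as the demo did, but it cannot see that "conduit" and "waterway" are related. That gap is what an LSA package is expected to close.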

Regards,
Antonio

> Latent Semantic Analysis should do the trick. There are a variety of tools
> on the website that should help you out.
> http://lsa.colorado.edu/
> Scott Crossley, Ph.D.
> Linguistics/TESOL
> Department of English
> Mississippi State University
> http://www.msstate.edu/dept/english/tesol/tesolfaculty.html
> (662) 325-2355
> Institute for Intelligent Systems
> University of Memphis
> http://mnemosyne.csl.psyc.memphis.edu/iis/
