There have been a number of papers on various tree kernels and path kernels (easily found by searching). Each parse tree is mapped to a high-dimensional vector that records the counts of various substructures, such as complete and incomplete subtrees, subcategorization frames, and/or dependency paths. The similarity of two trees is then defined as the dot product of their vectors. This dot product can typically be computed efficiently by dynamic programming over the pair of trees, without ever expanding out the actual high-dimensional vector for each tree (an instance of the "kernel trick").
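To make that concrete, here is a minimal sketch in the style of the Collins & Duffy subset-tree kernel. The Tree class, the node layout, and the decay parameter lam are my own illustrative choices, not code from any particular paper:

    class Tree:
        def __init__(self, label, children=()):
            self.label = label
            self.children = tuple(children)

        def production(self):
            # The rule expanding this node, e.g. ('NP', ('DT', 'NN')).
            return (self.label, tuple(c.label for c in self.children))

        def nodes(self):
            yield self
            for c in self.children:
                yield from c.nodes()

    def subset_tree_kernel(t1, t2, lam=0.5):
        """Dot product of the trees' (implicit) subtree-count vectors,
        computed by dynamic programming over pairs of nodes."""
        cache = {}

        def C(n1, n2):
            # Discounted number of matching tree fragments rooted at n1 and n2.
            key = (id(n1), id(n2))
            if key not in cache:
                if n1.production() != n2.production():
                    val = 0.0
                elif all(not c.children for c in n1.children):
                    # Matching preterminals (same tag over the same word).
                    val = lam
                else:
                    val = lam
                    for c1, c2 in zip(n1.children, n2.children):
                        val *= 1.0 + C(c1, c2)
                cache[key] = val
            return cache[key]

        internal1 = [n for n in t1.nodes() if n.children]
        internal2 = [n for n in t2.nodes() if n.children]
        return sum(C(a, b) for a in internal1 for b in internal2)

    # Toy usage: two parses that share the NP "the dog" and the S -> NP VP skeleton.
    def np():
        return Tree('NP', [Tree('DT', [Tree('the')]), Tree('NN', [Tree('dog')])])

    t1 = Tree('S', [np(), Tree('VP', [Tree('VB', [Tree('barked')])])])
    t2 = Tree('S', [np(), Tree('VP', [Tree('VB', [Tree('slept')])])])
    print(subset_tree_kernel(t1, t2))   # nonzero, driven by the shared substructure

The recursion C(n1, n2) accumulates the contributions of exponentially many fragment pairs in a single quadratic pass over node pairs; the same trick applies to counts of dependency paths and other substructures.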
Alternatively, for an asymmetric measure, see the work on quasi-synchronous grammar, e.g., "What is the Jeopardy Model? A Quasi-Synchronous Grammar for QA" by Mengqiu Wang, Noah A. Smith, and Teruko Mitamura (EMNLP 2007). http://www.cs.cmu.edu/~nasmith/papers/wang+smith+mitamura.emnlp07.pdf
Most of these methods can be extended naturally to work efficiently over packed forests of parse trees, so that you don't have to commit to a single parse tree for each sentence.
-cheers, jason
On Sat, Nov 22, 2008 at 6:33 AM, Paul McNamee <paul.mcnamee at jhuapl.edu> wrote:
> Cui et al. had a paper at SIGIR 2005, "Question Answering Passage Retrieval
> Using Dependency Relations":
> http://doi.acm.org/10.1145/1076034.1076103
>
> They looked for sentences that might contain an answer to a question
> for experiments in question answering at TREC. And I believe some of
> their source code was made publicly available.
>
> You might also find some relevant work from the RTE evaluations:
> http://www.nist.gov/tac/tracks/2008/rte/
>
> - Paul