[Corpora-List] SVD on high-dimension data

Christopher Manning manning at stanford.edu
Tue Mar 6 23:01:00 CET 2007


But InfoMap actually uses SVDPACKC internally.

The top-level answer is that you restrict the space of context vectors
(the columns) to a few thousand dimensions, so the SVD is really run on
something like a 1 million by 5000 matrix, not the full 1 million by
1 million term-term matrix.
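
To make the restricted-space idea concrete, here is a minimal sketch in Python. It keeps a row for every term but only counts co-occurrences with a small set of context words, so the matrix is (vocabulary size) by (number of context words). Choosing the most frequent words as the context columns, the function name `restricted_cooccurrence`, and the `window` parameter are all assumptions for illustration, not a description of what InfoMap actually does.

```python
from collections import Counter


def restricted_cooccurrence(tokens, n_context, window=2):
    """Count co-occurrences of every term (rows) with only the
    n_context most frequent words (columns).

    Hypothetical helper for illustration; the column-selection rule
    (plain frequency) is an assumption.
    """
    freq = Counter(tokens)
    # Most frequent words first; ties broken alphabetically so the
    # column order is deterministic.
    ranked = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
    context_words = [w for w, _ in ranked[:n_context]]
    col = {w: j for j, w in enumerate(context_words)}

    # Every distinct term gets a row, however rare it is.
    rows = sorted(freq)
    row = {w: i for i, w in enumerate(rows)}

    # Sparse storage: (row index, column index) -> count.
    counts = Counter()
    for i, term in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            # Neighbours outside the restricted context set are ignored.
            if j != i and tokens[j] in col:
                counts[(row[term], col[tokens[j]])] += 1
    return rows, context_words, counts


tokens = "the cat sat on the mat and the dog sat on the rug".split()
rows, cols, counts = restricted_cooccurrence(tokens, n_context=3)
print(len(rows), "terms by", len(cols), "context words")
```

The resulting sparse (row, column, count) triples are what would be handed to a sparse truncated-SVD routine. The number of columns is fixed by `n_context` rather than growing with the vocabulary, which is what keeps the problem tractable.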

Chris.


On Mar 6, 2007, at 7:38 AM, David Reitter wrote:


> Jamie,
>
> On 6 Mar 2007, at 14:59, Jamie Smith wrote:
>
>> I have large (1 million by 1 million) term-term matrices. What SVD
>> packages work with such massive datasets? I have tried Matlab and
>> SVDPACKC without much success.
>
> Have a look at Infomap,
>
> http://infomap-nlp.sourceforge.net/
> http://infomap.stanford.edu/
>
> we've used it successfully on the Aquaint and DUC2005 data (100+
> million words).
>
> --
> David Reitter
> ICCS/HCRC, Informatics, University of Edinburgh
> http://www.david-reitter.com








