[Corpora-List] Local Similarity in LDA

Yashar Najafloo yasharnajafloo at yahoo.com
Thu May 19 00:40:27 CEST 2016

Hi there,

I have a question about the similarity between two words in LDA (Latent Dirichlet Allocation) and was wondering if anyone could kindly help me out. I'll try to keep it short.

I have a corpus that I analysed using LDA with variational inference. I now know the topic proportions of each document and the distribution of each topic over the words in my word list.

I know that the similarity between two words can be calculated from the amount of topic mass the two share: the sum over topics (say 10 topics) of the conditional probability of word one given topic z, multiplied by the conditional probability of topic z given word two:

P(w1|w2) = SUM (z=1 to 10) [ P(w1|z) P(z|w2) ]

The question is how to calculate the similarity of two words within a particular document, given that we know the document's topic proportions. I was thinking of taking the document's topic proportions as weights, multiplying each topic's term by its weight, and then working through the formula above.

Is what I am trying to achieve mathematically correct?

Regards,
Yashar
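
The formula in the question, together with one possible reading of the document-weighted variant, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a confirmed answer: the names phi, corpus_prior and theta_d and all the numbers are made up, P(z|w2) is assumed to come from Bayes' rule as P(w2|z) P(z) normalised over topics, and the document-specific version is assumed to mean replacing the corpus-level P(z) with the document's topic proportions P(z|d).

```python
# Toy example: 2 topics, 3 words. All values are hypothetical.
# phi[z][w] = P(w | z), each row sums to 1.
phi = [
    [0.5, 0.4, 0.1],
    [0.1, 0.2, 0.7],
]
corpus_prior = [0.5, 0.5]   # assumed corpus-level P(z)
theta_d = [0.1, 0.9]        # assumed topic proportions P(z | d) of one document


def topic_posterior(phi, topic_weights, w):
    """P(z | w) proportional to P(w | z) * weight(z), normalised over topics."""
    unnorm = [phi[z][w] * topic_weights[z] for z in range(len(phi))]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def word_given_word(phi, topic_weights, w1, w2):
    """SUM over z of P(w1 | z) * P(z | w2), with P(z | w2) from topic_posterior."""
    post = topic_posterior(phi, topic_weights, w2)
    return sum(phi[z][w1] * post[z] for z in range(len(phi)))


# Corpus-level similarity P(w1 | w2), as in the formula from the question.
global_sim = word_given_word(phi, corpus_prior, w1=0, w2=1)

# Document-level variant: same sum, but topics weighted by P(z | d).
local_sim = word_given_word(phi, theta_d, w1=0, w2=1)

print(global_sim, local_sim)
```

Under these assumptions the document-level value is still a proper conditional distribution over w1 (it sums to 1 across the word list), which is one way to sanity-check the weighting idea on real topic and document matrices.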
