[Corpora-List] Geometrical representation of NL phrases for similarity comparison

Ignacio J. Iacobacci iiacobac at gmail.com
Fri Oct 19 11:13:42 CEST 2018


Hello Alexander,

There are many options much better than this one, but doc2vec, the extension of word2vec to sentences and documents, will work for you: https://radimrehurek.com/gensim/models/doc2vec.html

All the best!

Ignacio
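
As a rough illustration of the simpler averaging approach mentioned in the quoted message below (this is not doc2vec), here is a minimal Python sketch. The 3-dimensional vectors in TOY_VECTORS are invented toy values, not the output of any trained model, and the names phrase_vector and cosine are hypothetical helpers; in practice the vectors would come from a model such as word2vec, GloVe or fastText.

```python
import math

# Toy 3-dimensional word vectors, invented purely for illustration.
# Real vectors would come from a trained model (word2vec, GloVe, fastText).
TOY_VECTORS = {
    "girl":   [0.9, 0.1, 0.0],
    "chick":  [0.8, 0.2, 0.1],
    "pretty": [0.1, 0.9, 0.0],
    "catch":  [0.2, 0.8, 0.1],
    "train":  [0.0, 0.1, 0.9],
    "late":   [0.1, 0.0, 0.8],
}


def phrase_vector(tokens, vectors):
    """Average the vectors of the known tokens; unknown tokens are skipped."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        raise ValueError("none of the tokens has a vector")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


informal = phrase_vector("that chick is such a catch".split(), TOY_VECTORS)
formal = phrase_vector("this girl is pretty".split(), TOY_VECTORS)
unrelated = phrase_vector("the train is late".split(), TOY_VECTORS)

# With these toy values the two paraphrases end up closer to each other
# than either is to the unrelated phrase.
print(cosine(informal, formal) > cosine(informal, unrelated))
```

Averaging ignores word order and treats every word equally, which is one reason a dedicated phrase or document model may do better on real data.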

On Fri, 19 Oct 2018 at 10:10, Alexander Osherenko (<osherenko at gmx.de>) wrote:


> Thanks, Mohammad. Unfortunately, I am looking for a geometric representation
> of phrases, not of words.
>
> Best, Alexander
>
>
> On Fri, 19 Oct 2018 at 11:01, Mohammad Akbari <
> akbari.ma at gmail.com> wrote:
>
>> Hello Alexander,
>>
>> Word embedding models, such as word2vec and GloVe, are common
>> approaches, in which each word is represented by a numerical vector (
>> https://arxiv.org/pdf/1310.4546.pdf,
>> https://code.google.com/archive/p/word2vec/). Once you have word
>> embeddings, you can do geometric computations on those vectors. A common
>> approach is to compute the average embedding of all words in a phrase. You
>> can check fastText for this purpose.
>>
>>
>> Regards,
>> Mohammad
>>
>> On 19 Oct 2018, at 09:41, Alexander Osherenko <osherenko at gmx.de> wrote:
>>
>> Hi,
>>
>> I wonder if it is possible to represent NL phrases geometrically, for
>> example, to compare their similarity. For example, the phrase "Hey man,
>> that chick *is such a catch!*" and more formal "..., this girl is
>> pretty!" should be represented geometrically nearby because they are
>> semantically similar.
>>
>> I am aware of LSA vectors that represent particular words; similarity
>> can then be evaluated as the distance between these word vectors in the LSA
>> space. However, the LSA approach only works for individual words, not for
>> phrases, and it is IMHO too numerical because it doesn't consider the
>> semantics of the participating words.
>>
>> Best, Alexander
>> --
>> Alexander Osherenko, Dr. rer. nat.
>> Senior HCI architect
>> Founder and R&D
>> Socioware Development <http://www.socioware.de/osherenko_page.html>
>> Profile: ResearchGate
>> <https://www.researchgate.net/profile/Alexander_Osherenko>
>> Implementing Social Smart Environments with a Large Number of Believable
>> Inhabitants in the Context of Globalization
>> <https://www.researchgate.net/publication/327425719_Implementing_Social_Smart_Environments_with_a_Large_Number_of_Believable_Inhabitants_in_the_Context_of_Globalization> at
>> Springer
>> _______________________________________________
>> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
>> Corpora mailing list
>> Corpora at uib.no
>> https://mailman.uib.no/listinfo/corpora
>>
>>

-- Men who become accustomed to worrying about the needs of machines become callous about the needs of men (Isaac Asimov)

Ignacio J. Iacobacci
iiacobac at gmail.com
iiacobacci at dc.uba.ar
iacobacci at di.uniroma1.it