[Corpora-List] Geometrical representation of NL phrases for similarity comparison

Alexander Osherenko osherenko at gmx.de
Fri Oct 19 11:04:25 CEST 2018


Thanks, Mohammad. Unfortunately, I am looking for a geometric representation of phrases, not of words.

Best, Alexander

On Fri, 19 Oct 2018 at 11:01, Mohammad Akbari <akbari.ma at gmail.com> wrote:


> Hello Alexander,
>
> Word embedding models such as word2vec and GloVe are a common approach:
> each word is represented by a numerical vector (
> https://arxiv.org/pdf/1310.4546.pdf,
> https://code.google.com/archive/p/word2vec/). Once you have word
> embeddings, you can do geometric computations based on their vectors. A
> common approach is to compute the average embedding of all words in a
> phrase; you can check fastText for this purpose.
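
The averaging approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the 3-dimensional vectors in TOY_VECTORS are made-up values, not output of word2vec, GloVe, or fastText, and the helper names phrase_vector and cosine_similarity are hypothetical. In practice the vectors would come from a pretrained embedding model.

```python
import math

# Hypothetical toy embeddings (made-up 3-dimensional values for illustration).
# Real embeddings would be loaded from a pretrained model instead.
TOY_VECTORS = {
    "girl": [0.9, 0.1, 0.0],
    "chick": [0.8, 0.2, 0.1],
    "pretty": [0.1, 0.9, 0.0],
    "catch": [0.2, 0.8, 0.1],
    "train": [0.0, 0.1, 0.9],
    "late": [0.1, 0.0, 0.8],
}


def phrase_vector(phrase, vectors):
    """Average the vectors of all words in the phrase that have an embedding."""
    tokens = [t.strip(".,!?\"'").lower() for t in phrase.split()]
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        raise ValueError("no word in the phrase has an embedding")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


slang = phrase_vector("that chick is such a catch!", TOY_VECTORS)
formal = phrase_vector("this girl is pretty!", TOY_VECTORS)
other = phrase_vector("the train is late", TOY_VECTORS)

print(cosine_similarity(slang, formal))  # the two paraphrases
print(cosine_similarity(slang, other))   # an unrelated phrase
```

With these toy values the two paraphrases end up closer to each other than to the unrelated phrase. Note that a plain average ignores word order and skips words without an embedding, so it is a baseline rather than a full phrase representation.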
>
>
> Regards,
> Mohammad
>
> On 19 Oct 2018, at 09:41, Alexander Osherenko <osherenko at gmx.de> wrote:
>
> Hi,
>
> I wonder if it is possible to represent NL phrases geometrically, for
> example, to compare their similarity. For example, the phrase "Hey man,
> that chick *is such a catch!*" and the more formal "..., this girl is
> pretty!" should lie close to each other geometrically because they are
> semantically similar.
>
> I am aware of LSA vectors that represent particular words; similarity
> can be evaluated as a distance between these word vectors in the LSA
> space. However, the LSA approach only works for individual words, not
> phrases, and it is IMHO too numerical because it doesn't consider the
> semantics of the participating words.
>
> Best, Alexander
> --
> Alexander Osherenko, Dr. rer. nat.
> Senior HCI architect
> Founder and R&D
> Socioware Development <http://www.socioware.de/osherenko_page.html>
> Profile: ResearchGate
> <https://www.researchgate.net/profile/Alexander_Osherenko>
> Implementing Social Smart Environments with a Large Number of Believable
> Inhabitants in the Context of Globalization
> <https://www.researchgate.net/publication/327425719_Implementing_Social_Smart_Environments_with_a_Large_Number_of_Believable_Inhabitants_in_the_Context_of_Globalization> at
> Springer