I am trying to understand whether the word embeddings produced by one of the available methods (say, word2vec) follow a specific distribution in space.
More precisely: given a set of n-dimensional normalised embeddings produced from a corpus, is there any study that has analysed the distribution of the vectors over the unit sphere in n-dimensional space?
We certainly cannot assume a uniform distribution; a distribution driven by the small-world hypothesis seems more realistic, but I am not aware of any precise result on this.
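To make the question concrete, here is a minimal sketch of the kind of check I have in mind. It is only an illustration under stated assumptions, not a result from any study: the function names are hypothetical, and the "clustered" vectors are synthetic stand-ins rather than real word2vec output. It compares two simple statistics against a uniform baseline: the mean resultant length (close to 0 for a uniform distribution on the sphere) and the variance of pairwise cosines (close to 1/n for a uniform distribution in n dimensions).

```python
import numpy as np


def normalise_rows(x):
    """Scale every row vector to unit Euclidean length."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def uniform_on_sphere(m, n, rng):
    """Draw m points uniformly on the unit sphere in R^n
    by normalising isotropic Gaussian samples."""
    return normalise_rows(rng.standard_normal((m, n)))


def uniformity_summary(unit_vectors):
    """Return simple statistics that are sensitive to non-uniformity."""
    m, n = unit_vectors.shape
    # Length of the average vector: near 0 if directions cancel out.
    mean_resultant_length = float(np.linalg.norm(unit_vectors.mean(axis=0)))
    # Cosine similarity of every distinct pair (upper triangle only).
    cosines = (unit_vectors @ unit_vectors.T)[np.triu_indices(m, k=1)]
    return {
        "mean_resultant_length": mean_resultant_length,
        "cosine_mean": float(cosines.mean()),
        "cosine_var": float(cosines.var()),
        "uniform_cosine_var": 1.0 / n,  # theoretical value under uniformity
    }


rng = np.random.default_rng(0)

# Baseline: points that really are uniform on the sphere.
baseline_stats = uniformity_summary(uniform_on_sphere(1000, 50, rng))

# Synthetic stand-in for embeddings (an assumption, not real data):
# Gaussian vectors sharing a common offset, so they crowd into one region.
clustered = normalise_rows(rng.standard_normal((1000, 50)) + 2.0)
clustered_stats = uniformity_summary(clustered)

print(baseline_stats)
print(clustered_stats)
```

With real embeddings, one would replace the synthetic `clustered` matrix with the normalised embedding matrix; a mean resultant length or cosine mean far from the baseline values would indicate a clear departure from uniformity, although it would say nothing yet about which distribution fits instead.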
Any pointers would be much appreciated; if there are many replies, I will collect them in a digest.
Many thanks
Best,
Fabio
-----------------------------------------------------------------------------
Fabio Tamburini, PhD
Associate Professor
FICLIT - University of Bologna - ITALY
E-mail: fabio.tamburini at unibo.it
-----------------------------------------------------------------------------