Details about the post

We are looking for a postdoctoral research associate to work on the EPSRC-funded project “Encyclopedic Lexical Representations for Natural Language Processing (ELEXIR)”. The aim of this project is to learn vector space embeddings that capture fine-grained knowledge about concepts. Unlike existing approaches, these representations will explicitly represent the properties of, and relationships between, concepts. Vectors in the proposed framework will thus intuitively play the role of facts, about which we can reason in a principled way. More details about this post can be found at:
Background about the ELEXIR project

The field of Natural Language Processing (NLP) has made unprecedented progress over the last decade, but the extent to which NLP systems “understand” language is still remarkably limited. A key underlying problem is the need for a vast amount of world knowledge. In this project, we focus on conceptual knowledge, and more specifically on:
(i) capturing what properties are associated with a given concept (e.g. lions are dangerous, boats can float);
(ii) characterising how different concepts are related (e.g. brooms are used for cleaning, bees produce honey).
Our proposed approach relies on the fact that Wikipedia contains a wealth of such knowledge. However, important properties and relationships are often not explicitly mentioned in text, especially if a human reader can straightforwardly infer them from other information (e.g. if X is an animal that can fly, then X probably has wings). Apart from learning to extract knowledge expressed in text, we thus also have to learn how to reason about conceptual knowledge.
A central question is how conceptual knowledge should be represented and incorporated into language model architectures. Current NLP systems rely heavily on vector representations in which each concept is represented by a single vector. This approach has important theoretical limitations in terms of what knowledge can be captured, and it only allows for shallow forms of reasoning. In contrast, in symbolic AI, conceptual knowledge is typically represented using facts and rules. This enables powerful forms of reasoning, but symbolic representations are harder to learn and to use in neural networks.
The solution we propose relies on a novel hybrid representation framework, which combines the main advantages of vector representations with those of symbolic methods. In particular, we will explicitly represent properties and relationships, as in symbolic frameworks, but these properties and relations will be encoded as vectors. Each concept will thus be associated with several property vectors, while pairs of related concepts will be associated with one or more relation vectors. Our vectors will thus intuitively play the same role that facts play in symbolic frameworks, with associated neural network models then playing the role of rules.