Contextualised approaches to embedding word senses
Ansell, A. J. (2020). Contextualised approaches to embedding word senses (Thesis, Master of Science (Research) (MSc(Research))). The University of Waikato, Hamilton, New Zealand. Retrieved from https://hdl.handle.net/10289/13564
Permanent Research Commons link: https://hdl.handle.net/10289/13564
Vector representations of text are an essential tool for modern Natural Language Processing (NLP), and there has been much work devoted to finding effective methods for obtaining such representations. Most previously proposed methods derive vector representations for individual words, known as word embeddings. While word embeddings have enabled considerable advances in NLP, they have a significant theoretical drawback: many words have several, often completely unrelated, meanings, and it seems dubious to conflate these multiple meanings into a single point in semantic space.
This drawback has inspired an alternative, "multi-sense" approach to representing words. In this approach, rather than learning a single vector for each word, multiple vectors, or "sense embeddings," are learned, corresponding to the individual meanings of the word. While this approach has not in general surpassed the word embedding approach, it has proved beneficial for a number of tasks such as word similarity estimation and word sense induction.
One of the most significant recent advances in NLP has been the development of "contextualised" word embedding models. Whereas word embeddings model the semantic properties of words in isolation, contextualised models represent the meanings of words in context. This enables them to capture some of the vast array of linguistic phenomena that occur above the word level.
I propose a number of new methods for learning sense embeddings which exploit contextualised techniques, based on the underlying hypothesis that the probability of a word occurring in a given context is equal to the sum of the probabilities of its individual senses occurring in the context. I first validate this hypothesis by using it to derive a simple method for learning sense embeddings inspired by the Skip-gram model. I then present a method for extracting sense embeddings from a contextualised word embedding model. Finally, I propose an end-to-end model for learning sense embeddings, and show that it comprehensively outperforms previous sense embedding models on the task of word sense induction, a standard task for evaluating such models. To demonstrate the model's flexibility, I apply it to some other word-sense-related tasks with good results.
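The hypothesis stated in the abstract, that the probability of a word in a context equals the sum of the probabilities of its senses in that context, can be illustrated with a small sketch. This is not the thesis's model: the sense inventory, the two-dimensional vectors, the dot-product scoring and the softmax normalisation are all assumptions made here for illustration, and the names `SENSE_VECTORS`, `sense_probabilities` and `word_probabilities` are hypothetical.

```python
import math

# Hypothetical toy sense inventory: each word maps to one vector per sense.
# The vectors are made-up values chosen only for illustration.
SENSE_VECTORS = {
    "bank": [[1.0, 0.0], [0.0, 1.0]],  # e.g. a financial sense and a river sense
    "money": [[0.9, 0.1]],
    "shore": [[0.1, 0.9]],
}


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def sense_probabilities(context_vector):
    """Softmax over every sense of every word, scored against the context."""
    scores = {
        (word, i): dot(vec, context_vector)
        for word, vecs in SENSE_VECTORS.items()
        for i, vec in enumerate(vecs)
    }
    max_score = max(scores.values())  # subtract the max for numerical stability
    exps = {key: math.exp(s - max_score) for key, s in scores.items()}
    total = sum(exps.values())
    return {key: e / total for key, e in exps.items()}


def word_probabilities(context_vector):
    """P(word | context) computed as the sum of its senses' probabilities."""
    probs = {word: 0.0 for word in SENSE_VECTORS}
    for (word, _), p in sense_probabilities(context_vector).items():
        probs[word] += p
    return probs
```

Calling `word_probabilities([2.0, 0.0])` with this assumed context vector gives a distribution over the three words in which the probability of "bank" is the total of its two sense probabilities, with the first sense contributing most of it.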
The University of Waikato
All items in Research Commons are provided for private study and research purposes and are protected by copyright with all rights reserved unless otherwise indicated.