Research Commons
      • Browse 
        • Communities & Collections
        • Titles
        • Authors
        • By Issue Date
        • Subjects
        • Types
        • Series
      • Help 
        • About
        • Collection Policy
        • OA Mandate Guidelines
        • Guidelines FAQ
        • Contact Us
      • My Account 
        • Sign In
        • Register
      View Item 
      •   Research Commons
      • University of Waikato Theses
      • Masters Degree Theses
      • View Item
      •   Research Commons
      • University of Waikato Theses
      • Masters Degree Theses
      • View Item
      JavaScript is disabled for your browser. Some features of this site may not work without it.

      Contextualised approaches to embedding word senses

      Ansell, Alan John
      Thumbnail
      Files
      thesis.pdf
      1.018Mb
      Citation
      Export citation
      Ansell, A. J. (2020). Contextualised approaches to embedding word senses (Thesis, Master of Science (Research) (MSc(Research))). The University of Waikato, Hamilton, New Zealand. Retrieved from https://hdl.handle.net/10289/13564
      Permanent Research Commons link: https://hdl.handle.net/10289/13564
      Abstract
      Vector representations of text are an essential tool for modern Natural Language Processing (NLP), and there has been much work devoted to finding effective methods for obtaining such representations. Most previously proposed methods derive vector representations for individual words, known as word embeddings. While word embeddings have enabled considerable advances in NLP, they have a significant theoretical drawback: many words have several, often completely unrelated meanings; it seems dubious to conflate these multiple meanings into a single point in semantic space.

      This drawback has inspired an alternative, "multi-sense" approach to representing words. In this approach, rather than learning a single vector for each word, multiple vectors, or "sense embeddings," are learned corresponding to the individual meanings of the word. While this approach has not in general surpassed the word embedding approach, it has proved beneficial for a number of tasks such as word similarity estimation and word sense induction.

      One of the most significant recent advances in NLP has been the development of "contextualised" word embedding models. Whereas word embeddings model the semantic properties of words in isolation, contextualised models represent the meanings of words in context. This enables them to capture some of the vast array of linguistic phenomena that occur above the word level.

      I propose a number of new methods for learning sense embeddings which exploit contextualised techniques, based on the underlying hypothesis that the probability of a word occurring in a given context is equal to the sum of the probabilities of its individual senses occurring in the context. I first validate this hypothesis by using it to derive a simple method for learning sense embeddings inspired by the Skip-gram model. I then present a method for extracting sense embeddings from a contextualised word embedding model. Finally I propose an end-to-end model for learning sense embeddings, and show that it comprehensively outperforms previous sense embedding models on the task of word sense induction, a standard task for evaluation of such models. To demonstrate the model's flexibility I apply it to some other word-sense related tasks with good results.
      Date
      2020
      Type
      Thesis
      Degree Name
      Master of Science (Research) (MSc(Research))
      Supervisors
      Pfahringer, Bernhard
      Bravo-Marquez, Felipe
      Publisher
      The University of Waikato
      Rights
      All items in Research Commons are provided for private study and research purposes and are protected by copyright with all rights reserved unless otherwise indicated.
      Collections
      • Masters Degree Theses [2388]
      Show full item record  

      Usage

      Downloads, last 12 months
      63
       
       

      Usage Statistics

      For this itemFor all of Research Commons

      The University of Waikato - Te Whare Wānanga o WaikatoFeedback and RequestsCopyright and Legal Statement