Research Commons

      Māori loanwords: a corpus of New Zealand English tweets

      Trye, David; Calude, Andreea S.; Bravo-Marquez, Felipe; Keegan, Te Taka Adrian Gregory
      Files
      loanwords2019.pdf
      191.0Kb
      Citation
      Trye, D., Calude, A. S., Bravo-Marquez, F., & Keegan, T. T. A. G. (2019). Māori loanwords: a corpus of New Zealand English tweets. Presented at the Vocab@Leuven 2019, Florence, Italy.
      Permanent Research Commons link: https://hdl.handle.net/10289/13040
      Abstract
      Māori loanwords are widely used in New Zealand English for various social functions by New Zealanders within and outside of the Māori community. Motivated by the lack of linguistic resources for studying how Māori loanwords are used in social media, we present a new corpus of New Zealand English tweets. We collected tweets containing selected Māori words that are likely to be known by New Zealanders who do not speak Māori. Since over 30% of these words turned out to be irrelevant (e.g., mana is a popular gaming term, Moana is a character from a Disney movie), we manually annotated a sample of our tweets into relevant and irrelevant categories. This data was used to train machine learning models to automatically filter out irrelevant tweets.
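The filtering step described in the abstract (label a sample of tweets by hand, then train a model to discard irrelevant ones) can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not name the models or features used, so the example below assumes a bag-of-words multinomial Naive Bayes classifier with add-one smoothing. The example tweets, the labels "relevant" and "irrelevant", and the function names `tokenize`, `train`, `classify`, and `filter_relevant` are all hypothetical and invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical hand-labelled sample. These tweets are invented for
# illustration and are NOT taken from the corpus described above.
LABELLED = [
    ("kia ora whanau have a great day", "relevant"),
    ("so much aroha for my whanau in aotearoa", "relevant"),
    ("the haka before the match gave the team real mana", "relevant"),
    ("need more mana potions to cast this spell", "irrelevant"),
    ("low on mana again in this raid boss fight", "irrelevant"),
    ("watching moana the disney movie with the kids", "irrelevant"),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (macrons kept)."""
    return re.findall(r"[a-zāēīōū]+", text.lower())


def train(examples):
    """Count tokens per label; returns (word_counts, class_counts, vocab)."""
    word_counts = {}
    class_counts = Counter()
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for token in tokenize(text):
            counts[token] += 1
            vocab.add(token)
    return word_counts, class_counts, vocab


def classify(text, model):
    """Return the label with the highest smoothed log-probability."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    # Labels are visited in sorted order, so an exact tie keeps the first one.
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        label_tokens = sum(word_counts[label].values())
        for token in tokenize(text):
            if token in vocab:  # tokens never seen in training are skipped
                numerator = word_counts[label][token] + 1  # add-one smoothing
                score += math.log(numerator / (label_tokens + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def filter_relevant(tweets, model):
    """Keep only the tweets that the model labels as relevant."""
    return [tweet for tweet in tweets if classify(tweet, model) == "relevant"]


MODEL = train(LABELLED)
```

Calling `filter_relevant(["kia ora whanau", "mana potions for the boss fight"], MODEL)` keeps only the first tweet, because the gaming vocabulary in the second one scores higher under the "irrelevant" label. A real filter would need a far larger annotated sample and held-out evaluation.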
      Date
      2019
      Type
      Conference Contribution
      Collections
      • Computing and Mathematical Sciences Papers [1455]
      • Arts and Social Sciences Papers [1424]
      The University of Waikato - Te Whare Wānanga o Waikato