Research Commons

      From unlabelled tweets to Twitter-specific opinion words

      Bravo-Márquez, Felipe; Frank, Eibe; Pfahringer, Bernhard
      Files
      SIGIR paper.pdf
      Accepted version, 188.7Kb
      DOI
       10.1145/2766462.2767770
      Citation
      Bravo-Márquez, F., Frank, E., & Pfahringer, B. (2015). From unlabelled tweets to Twitter-specific opinion words. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, Santiago de Chile, 9-13 August 2015 (pp. 743–746). New York, USA: ACM. http://doi.org/10.1145/2766462.2767770
      Permanent Research Commons link: https://hdl.handle.net/10289/9567
      Abstract
      In this article, we propose a word-level classification model for automatically generating a Twitter-specific opinion lexicon from a corpus of unlabelled tweets. The tweets from the corpus are represented by two vectors: a bag-of-words vector and a semantic vector based on word-clusters. We propose a distributional representation for words by treating them as the centroids of the tweet vectors in which they appear. The lexicon generation is conducted by training a word-level classifier using these centroids to form the instance space and a seed lexicon to label the training instances. Experimental results show that the two types of tweet vectors complement each other in a statistically significant manner and that our generated lexicon produces significant improvements for tweet-level polarity classification.
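The pipeline described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy tweets, the seed lexicon, the function names, and the optional `clusters` mapping are hypothetical. The nearest-class-mean rule with cosine similarity is a simple stand-in for the word-level classifier, which the abstract does not specify.

```python
from collections import Counter, defaultdict
import math


def tweet_vector(tokens, clusters=None):
    """Sparse tweet vector: bag-of-words counts plus optional word-cluster counts."""
    vec = Counter(("w", t) for t in tokens)
    if clusters:
        # `clusters` is a hypothetical word -> cluster-id mapping.
        vec.update(("c", clusters[t]) for t in tokens if t in clusters)
    return vec


def word_centroids(tweets, clusters=None):
    """Represent each word as the centroid of the tweet vectors it appears in."""
    sums = defaultdict(Counter)
    counts = Counter()
    for tokens in tweets:
        vec = tweet_vector(tokens, clusters)
        for word in set(tokens):
            sums[word].update(vec)
            counts[word] += 1
    return {w: {k: v / counts[w] for k, v in s.items()} for w, s in sums.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = defaultdict(float)
    for vec in vectors:
        for k, v in vec.items():
            total[k] += v
    return {k: v / len(vectors) for k, v in total.items()}


def expand_lexicon(centroids, seed):
    """Label words outside the seed lexicon by their most similar class mean.

    Stand-in classifier (an assumption), trained on seed-labelled centroids.
    """
    by_label = defaultdict(list)
    for word, label in seed.items():
        if word in centroids:
            by_label[label].append(centroids[word])
    class_means = {label: mean_vector(vs) for label, vs in by_label.items()}
    return {
        word: max(class_means, key=lambda lab: cosine(vec, class_means[lab]))
        for word, vec in centroids.items()
        if word not in seed
    }


# Toy, made-up corpus of tokenised tweets and a tiny seed lexicon.
tweets = [
    ["good", "great", "day"],
    ["great", "awesome", "win"],
    ["bad", "awful", "day"],
    ["awful", "terrible", "loss"],
    ["awesome", "good", "win"],
    ["terrible", "bad", "loss"],
]
seed = {"good": "positive", "great": "positive", "bad": "negative", "awful": "negative"}

centroids = word_centroids(tweets)
generated = expand_lexicon(centroids, seed)
```

In this toy run, words such as "awesome" and "terrible" receive labels because their centroids overlap with the seed words they co-occur with. A word like "day", which appears equally in both contexts, has no clear polarity under this rule.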
      Date
      2015
      Type
      Conference Contribution
      Publisher
      ACM
      Rights
      © 2015 Copyright is held by the author(s). Publication rights licensed to ACM
      Collections
      • Computing and Mathematical Sciences Papers [1455]
      The University of Waikato - Te Whare Wānanga o Waikato