      Annotate-Sample-Average (ASA): A New Distant Supervision Approach for Twitter Sentiment Analysis

      Bravo-Marquez, Felipe; Frank, Eibe; Pfahringer, Bernhard
      Files
      asa_paper.pdf
      Accepted version, 237.7Kb
      DOI
       10.3233/978-1-61499-672-9-498
      Link
       www.cs.waikato.ac.nz
      Citation
      Bravo-Marquez, F., Frank, E., & Pfahringer, B. (2016). Annotate-Sample-Average (ASA): A New Distant Supervision Approach for Twitter Sentiment Analysis. In G. Kaminka, M. Fox, P. Bouquet, E. Hüllermeier, V. Dignum, F. Dignum, & F. van Harmelen (Eds.), ECAI 2016: 22nd European Conference on Artificial Intelligence (Vol. 285, pp. 498–506). The Hague, Netherlands: IOS Press. http://doi.org/10.3233/978-1-61499-672-9-498
      Permanent Research Commons link: https://hdl.handle.net/10289/10753
      Abstract
      The classification of tweets into polarity classes is a popular task in sentiment analysis. State-of-the-art solutions to this problem are based on supervised machine learning models trained from manually annotated examples. A drawback of these approaches is the high cost involved in data annotation. Two freely available resources that can be exploited to solve the problem are: 1) large amounts of unlabelled tweets obtained from the Twitter API and 2) prior lexical knowledge in the form of opinion lexicons. In this paper, we propose Annotate-Sample-Average (ASA), a distant supervision method that uses these two resources to generate synthetic training data for Twitter polarity classification. Positive and negative training instances are generated by sampling and averaging unlabelled tweets containing words with the corresponding polarity. Polarity of words is determined from a given polarity lexicon. Our experimental results show that the training data generated by ASA (after tuning its parameters) produces a classifier that performs significantly better than a classifier trained from tweets annotated with emoticons and a classifier trained, without any sampling and averaging, from tweets annotated according to the polarity of their words.
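The annotate, sample, and average steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy lexicon, the example tweets, the bag-of-words representation, the function names, and the parameters `k` (tweets averaged per instance) and `n_per_class` (instances generated per class) are all assumptions made for illustration. In particular, placing a tweet that contains both positive and negative words into both groups is a choice of this sketch, not something the abstract specifies.

```python
import random
from collections import Counter

# Toy polarity lexicon, invented for illustration only.
LEXICON = {"good": "positive", "love": "positive",
           "bad": "negative", "hate": "negative"}


def annotate(tweets, lexicon):
    """Group tokenized tweets by the polarity of the lexicon words they contain.

    Assumption of this sketch: a tweet with words of both polarities
    is added to both groups.
    """
    groups = {"positive": [], "negative": []}
    for tweet in tweets:
        tokens = tweet.lower().split()
        polarities = {lexicon[t] for t in tokens if t in lexicon}
        for polarity in polarities:
            groups[polarity].append(tokens)
    return groups


def sample_and_average(group, k, rng):
    """Sample k tweets with replacement and average their word-count vectors."""
    totals = Counter()
    for _ in range(k):
        totals.update(rng.choice(group))
    return {word: count / k for word, count in totals.items()}


def generate_training_data(tweets, lexicon, n_per_class, k, seed=0):
    """Build synthetic (feature vector, label) pairs for each polarity class."""
    rng = random.Random(seed)
    groups = annotate(tweets, lexicon)
    data = []
    for polarity, group in groups.items():
        if not group:  # no tweet matched this polarity
            continue
        for _ in range(n_per_class):
            data.append((sample_and_average(group, k, rng), polarity))
    return data


# Hypothetical unlabelled tweets.
tweets = [
    "I love this phone",
    "what a good day",
    "I hate mondays",
    "bad service again",
    "good food but bad service",
]
training_data = generate_training_data(tweets, LEXICON, n_per_class=4, k=3)
```

Each element of `training_data` pairs an averaged sparse feature vector with a polarity label and could be passed to any standard classifier. The abstract notes that ASA's parameters are tuned, so the values of `k` and `n_per_class` used here are placeholders only.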
      Date
      2016-01-01
      Type
      Conference Contribution
      Publisher
      IOS Press
      Rights
      © 2016 The Authors and IOS Press. This article is published online with Open Access by IOS Press and distributed under the terms of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0).
      Collections
      • Computing and Mathematical Sciences Papers [1454]