From unlabelled tweets to Twitter-specific opinion words

Bravo-Marquez, Felipe; Frank, Eibe; Pfahringer, Bernhard

Abstract

In this article, we propose a word-level classification model for automatically generating a Twitter-specific opinion lexicon from a corpus of unlabelled tweets. The tweets from the corpus are represented by two vectors: a bag-of-words vector and a semantic vector based on word-clusters. We propose a distributional representation for words by treating them as the centroids of the tweet vectors in which they appear. The lexicon generation is conducted by training a word-level classifier using these centroids to form the instance space and a seed lexicon to label the training instances. Experimental results show that the two types of tweet vectors complement each other in a statistically significant manner and that our generated lexicon produces significant improvements for tweet-level polarity classification.
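The centroid representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only the bag-of-words tweet vectors (the cluster-based semantic vectors and the classifier itself are omitted), and the function names, toy tweets, and two-word seed lexicon are hypothetical, chosen purely for illustration.

```python
from collections import Counter, defaultdict


def tweet_vector(tokens):
    """Bag-of-words vector for one tweet, stored sparsely as token counts."""
    return Counter(tokens)


def word_centroids(tweets):
    """Represent each word as the mean of the vectors of the tweets containing it."""
    sums = defaultdict(Counter)  # word -> sum of the tweet vectors it appears in
    counts = Counter()           # word -> number of tweets containing the word
    for tokens in tweets:
        vec = tweet_vector(tokens)
        for word in set(tokens):  # count each tweet once per word
            sums[word].update(vec)
            counts[word] += 1
    return {
        word: {dim: total / counts[word] for dim, total in summed.items()}
        for word, summed in sums.items()
    }


def label_with_seed(centroids, seed_lexicon):
    """Split centroids into seed-labelled training instances and unlabelled words."""
    train = [(centroids[w], seed_lexicon[w]) for w in centroids if w in seed_lexicon]
    unlabelled = {w: c for w, c in centroids.items() if w not in seed_lexicon}
    return train, unlabelled


# Hypothetical toy corpus and seed lexicon (assumptions, not data from the paper).
tweets = [
    ["good", "movie", "lol"],
    ["bad", "movie"],
    ["good", "day"],
]
seed = {"good": "positive", "bad": "negative"}

centroids = word_centroids(tweets)
train, unlabelled = label_with_seed(centroids, seed)
```

In this sketch, `train` would be passed to any word-level classifier, and the words in `unlabelled` are the candidates whose predicted polarity would extend the lexicon.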

Type

Conference Contribution

Citation

Bravo-Márquez, F., Frank, E., & Pfahringer, B. (2015). From unlabelled tweets to Twitter-specific opinion words. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, Santiago de Chile, 9–13 August 2015 (pp. 743–746). New York, USA: ACM. http://doi.org/10.1145/2766462.2767770

Date

2015

Publisher

ACM
