Multinomial naive Bayes for text categorization revisited

Kibriya, Ashraf Masood; Frank, Eibe; Pfahringer, Bernhard; Holmes, Geoffrey

Abstract

This paper presents empirical results for several versions of the multinomial naive Bayes classifier on four text categorization problems, and a way of improving it using locally weighted learning. More specifically, it compares standard multinomial naive Bayes to the recently proposed transformed weight-normalized complement naive Bayes classifier (TWCNB) [1], and shows that some of the modifications included in TWCNB may not be necessary to achieve optimum performance on some datasets. However, it does show that TFIDF conversion and document length normalization are important. It also shows that support vector machines can, in fact, sometimes very significantly outperform both methods. Finally, it shows how the performance of multinomial naive Bayes can be improved using locally weighted learning. However, the overall conclusion of our paper is that support vector machines are still the method of choice if the aim is to maximize accuracy.
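The two preprocessing steps the abstract singles out, TFIDF conversion and document length normalization, can be illustrated with a minimal sketch that feeds the transformed weights into a multinomial naive Bayes classifier. This is not the authors' implementation. The smoothed IDF formula, the log-damped term frequency, the L2 norm, the Laplace parameter `alpha`, and all function names (`fit_idf`, `transform`, `train_mnb`, `predict`) are assumptions chosen for illustration. Locally weighted learning and the TWCNB variant are not attempted here.

```python
import math
from collections import Counter, defaultdict

def fit_idf(docs):
    """Compute a smoothed inverse document frequency for each training word.
    The smoothing (+1 terms) is an assumed variant, not taken from the paper."""
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))
    return {w: math.log((1 + n) / (1 + c)) + 1.0 for w, c in df.items()}

def transform(doc, idf):
    """TFIDF conversion followed by L2 document length normalization.
    Words unseen during training are ignored."""
    tf = Counter(w for w in doc if w in idf)
    vec = {w: math.log(1 + c) * idf[w] for w, c in tf.items()}
    norm = math.sqrt(sum(x * x for x in vec.values()))
    return {w: x / norm for w, x in vec.items()} if norm > 0 else {}

def train_mnb(vectors, labels, alpha=1.0):
    """Multinomial naive Bayes on fractional (TFIDF) weights with Laplace smoothing."""
    vocab = {w for v in vectors for w in v}
    class_docs = Counter(labels)
    weight = defaultdict(lambda: defaultdict(float))
    for v, y in zip(vectors, labels):
        for w, x in v.items():
            weight[y][w] += x
    model = {}
    for y in class_docs:
        total = sum(weight[y].values()) + alpha * len(vocab)
        log_prior = math.log(class_docs[y] / len(labels))
        log_like = {w: math.log((weight[y][w] + alpha) / total) for w in vocab}
        model[y] = (log_prior, log_like)
    return model

def predict(model, vec):
    """Return the class with the highest log posterior score."""
    def score(y):
        log_prior, log_like = model[y]
        return log_prior + sum(x * log_like[w] for w, x in vec.items() if w in log_like)
    return max(model, key=score)

# Toy corpus, invented purely for illustration.
train_docs = [
    ["cheap", "pills", "buy", "now"],
    ["buy", "cheap", "watches"],
    ["meeting", "agenda", "monday"],
    ["project", "meeting", "notes"],
]
train_labels = ["spam", "spam", "ham", "ham"]

idf = fit_idf(train_docs)
model = train_mnb([transform(d, idf) for d in train_docs], train_labels)
print(predict(model, transform(["cheap", "pills"], idf)))
print(predict(model, transform(["meeting", "notes"], idf)))
```

Because every document vector is scaled to unit length before training, long documents do not dominate the per-class word weights, which is the intuition behind length normalization in this setting.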

Type

Conference Contribution

Citation

Kibriya, A. M., Frank, E., Pfahringer, B. & Holmes, G. (2005). Multinomial naive Bayes for text categorization revisited. In G. I. Webb & Xinghuo Yu (Eds.), Proceedings of 17th Australian Joint Conference on Artificial Intelligence, Cairns, Australia, December 4-6, 2004 (pp. 488-499). Berlin: Springer.

Date

2005

Publisher

Springer
