Naïve Bayes for text classification with unbalanced classes

dc.contributor.author: Frank, Eibe
dc.contributor.author: Bouckaert, Remco R.
dc.date.accessioned: 2008-11-21T01:46:23Z
dc.date.available: 2008-11-21T01:46:23Z
dc.date.issued: 2006
dc.description.abstract: Multinomial naive Bayes (MNB) is a popular method for document classification due to its computational efficiency and relatively good predictive performance. It has recently been established that predictive performance can be improved further by appropriate data transformations [1,2]. In this paper we present another transformation that is designed to combat a potential problem with the application of MNB to unbalanced datasets. We propose an appropriate correction by adjusting attribute priors. This correction can be implemented as another data normalization step, and we show that it can significantly improve the area under the ROC curve. We also show that the modified version of MNB is very closely related to the simple centroid-based classifier and compare the two methods empirically. [en_US]
dc.identifier.citation: Frank, E. & Bouckaert, R. R. (2006). Naïve Bayes for text classification with unbalanced classes. In J. Fürnkranz, T. Scheffer, & M. Spiliopoulou (Eds.), Proceedings of 10th European Conference on Principles and Practice of Knowledge Discovery in Databases, Berlin, Germany, September 18-22, 2006 (pp. 503-510). Berlin: Springer. [en_US]
dc.identifier.doi: 10.1007/11871637_49 [en_US]
dc.identifier.uri: https://hdl.handle.net/10289/1442
dc.language.iso: en
dc.publisher: Springer, Berlin [en_US]
dc.relation.uri: http://www.springerlink.com/content/f2728072350t3465/ [en_US]
dc.subject: computer science [en_US]
dc.subject: Naive Bayes [en_US]
dc.subject: text classification [en_US]
dc.subject: Machine learning
dc.title: Naïve Bayes for text classification with unbalanced classes [en_US]
dc.type: Conference Contribution [en_US]
dspace.entity.type: Publication
