Naive Bayes for regression

dc.contributor.author: Frank, Eibe
dc.contributor.author: Trigg, Leonard E.
dc.contributor.author: Holmes, Geoffrey
dc.contributor.author: Witten, Ian H.
dc.date.accessioned: 2024-01-15T23:47:37Z
dc.date.available: 2024-01-15T23:47:37Z
dc.date.issued: 2000-10-01
dc.description.abstract: Despite its simplicity, the naive Bayes learning scheme performs well on most classification tasks, and is often significantly more accurate than more sophisticated methods. Although the probability estimates that it produces can be inaccurate, it often assigns maximum probability to the correct class. This suggests that its good performance might be restricted to situations where the output is categorical. It is therefore interesting to see how it performs in domains where the predicted value is numeric, because in this case, predictions are more sensitive to inaccurate probability estimates. This paper shows how to apply the naive Bayes methodology to numeric prediction (i.e., regression) tasks by modeling the probability distribution of the target value with kernel density estimators, and compares it to linear regression, locally weighted linear regression, and a method that produces “model trees”—decision trees with linear regression functions at the leaves. Although we exhibit an artificial dataset for which naive Bayes is the method of choice, on real-world datasets it is almost uniformly worse than locally weighted linear regression and model trees. The comparison with linear regression depends on the error measure: for one measure naive Bayes performs similarly, while for another it is worse. We also show that standard naive Bayes applied to regression problems by discretizing the target value performs similarly badly. We then present empirical evidence that isolates naive Bayes’ independence assumption as the culprit for its poor performance in the regression setting. These results indicate that the simplistic statistical assumption that naive Bayes makes is indeed more restrictive for regression than for classification.
dc.format.mimetype: application/pdf
dc.identifier.doi: 10.1023/A:1007670802811
dc.identifier.eissn: 1573-0565
dc.identifier.issn: 0885-6125
dc.identifier.uri: https://hdl.handle.net/10289/16335
dc.language.iso: English
dc.publisher: Springer
dc.relation.isPartOf: Machine Learning
dc.rights: This is an author’s accepted version of an article published in Machine Learning. © 2020 Springer Nature.
dc.subject: Science & Technology
dc.subject: Technology
dc.subject: Computer Science, Artificial Intelligence
dc.subject: Computer Science
dc.subject: naive Bayes
dc.subject: regression
dc.subject: model trees
dc.subject: linear regression
dc.subject: locally weighted regression
dc.subject: CLASSIFICATION
dc.title: Naive Bayes for regression
dc.type: Journal Article
dspace.entity.type: Publication
pubs.begin-page: 5
pubs.end-page: 25
pubs.issue: 1
pubs.publication-status: Published
pubs.volume: 41
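
The approach described in the abstract, modelling p(y | x1, ..., xk) as proportional to p(y) · ∏ p(xi | y) with kernel density estimators and predicting from the resulting posterior, can be sketched roughly as follows. This is a minimal illustration only, using Gaussian kernels, a single fixed bandwidth, and a grid approximation of the target distribution; the class and parameter names are invented here and the paper's actual kernel and bandwidth choices are not reproduced.

```python
import numpy as np

def gauss_kernel(u, h):
    """Gaussian kernel with bandwidth h, evaluated at u."""
    return np.exp(-0.5 * (u / h) ** 2) / (h * np.sqrt(2.0 * np.pi))

class NBRegressor:
    """Illustrative naive-Bayes-style regressor using kernel density estimates."""

    def fit(self, X, y, h=0.5):
        self.X = np.asarray(X, dtype=float)   # shape (n_samples, n_features)
        self.y = np.asarray(y, dtype=float)   # shape (n_samples,)
        self.h = h                            # fixed kernel bandwidth (assumption)
        return self

    def predict_one(self, x):
        # Evaluate the posterior over a grid of candidate target values.
        grid = np.linspace(self.y.min(), self.y.max(), 200)

        # Kernel density estimate of the prior p(y) at each grid point.
        y_kern = gauss_kernel(grid[:, None] - self.y[None, :], self.h)
        prior = y_kern.mean(axis=1)

        # Naive Bayes: multiply in p(xi | y) = p(xi, y) / p(y) per feature.
        post = prior.copy()
        for i, xi in enumerate(x):
            joint = (y_kern * gauss_kernel(xi - self.X[None, :, i],
                                           self.h)).mean(axis=1)
            post *= joint / np.maximum(prior, 1e-12)

        post /= post.sum()
        # Predict the posterior mean of the target value.
        return float((grid * post).sum())
```

Predicting the posterior mean is one of several options; the posterior mode or median could be read off the same grid, which matters because different error measures (as the abstract notes) reward different point predictions.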

Files

Original bundle
Name: nbr.pdf
Size: 217.2 KB
Format: Adobe Portable Document Format
Description: Accepted version

License bundle
Name: Research Commons Deposit Agreement 2017.pdf
Size: 188.11 KB
Format: Adobe Portable Document Format