Making better use of global discretization

Frank, Eibe; Witten, Ian H.

Authors

Frank, Eibe

Witten, Ian H.

Files

making better use of global discretization.pdf (184.47 KB)

Permanent Link

https://hdl.handle.net/10289/1507

Abstract

Before applying learning algorithms to datasets, practitioners often globally discretize any numeric attributes. If the algorithm cannot handle numeric attributes directly, prior discretization is essential. Even if it can, prior discretization often accelerates induction, and may produce simpler and more accurate classifiers. As it is generally done, global discretization denies the learning algorithm any chance of taking advantage of the ordering information implicit in numeric attributes. However, a simple transformation of discretized data preserves this information in a form that learners can use. We show that, compared to using the discretized data directly, this transformation significantly increases the accuracy of decision trees built by C4.5, decision lists built by PART, and decision tables built using the wrapper method, on several benchmark datasets. Moreover, it can significantly reduce the size of the resulting classifiers. This simple technique makes global discretization an even more useful tool for data preprocessing.
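The abstract does not spell out the transformation itself. As an illustrative sketch only, one way to keep ordering information after discretization is to replace an attribute that has k ordered intervals with k-1 binary indicators, where the i-th indicator records whether the value lies above the i-th cut point. This particular encoding, the function names, the cut points, and the sample values below are all assumptions made for illustration and are not taken from this record.

```python
from bisect import bisect_right


def discretize(value, cut_points):
    """Return the index of the interval that `value` falls into.

    `cut_points` must be sorted in ascending order; k-1 cut points
    define k intervals, indexed 0 .. k-1.
    """
    return bisect_right(cut_points, value)


def ordered_binary_encoding(bin_index, num_bins):
    """Encode an interval index as num_bins - 1 binary indicators.

    Indicator i is 1 when the value lies above cut point i, so a
    learner can still test "attribute > threshold" on discretized data.
    (Hypothetical encoding, assumed for this sketch.)
    """
    return [1 if bin_index > i else 0 for i in range(num_bins - 1)]


# Hypothetical cut points and attribute values.
cut_points = [2.5, 5.0, 7.5]
values = [1.0, 3.0, 6.0, 9.0]

bins = [discretize(v, cut_points) for v in values]
encoded = [ordered_binary_encoding(b, len(cut_points) + 1) for b in bins]

for v, b, e in zip(values, bins, encoded):
    print(v, b, e)
```

With a plain nominal encoding, the four intervals would be unrelated symbols. In the sketch, neighbouring intervals differ in exactly one indicator, so a single binary split can separate all low intervals from all high ones.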

Citation

Frank, E. & Witten, I. H. (1999). Making better use of global discretization. In Proceedings of the 16th International Conference on Machine Learning, Bled, Slovenia (pp. 115-123). San Francisco: Morgan Kaufmann Publishers.

Type

Conference Contribution

Date

1999

Publisher

Morgan Kaufmann Publishers Inc., San Francisco, CA, USA
