Publication

Unsupervised discretization using tree-based density estimation

Abstract
This paper presents an unsupervised discretization method that performs density estimation for univariate data. The subintervals that the discretization produces can be used as the bins of a histogram. Histograms are a simple and broadly understood means of displaying data, and our method automatically adapts bin widths to the data. It uses the log-likelihood as the scoring function to select cut points and the cross-validated log-likelihood to select the number of intervals. We compare this method with equal-width discretization, where we also select the number of bins using the cross-validated log-likelihood, and with equal-frequency discretization.
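The following is a minimal sketch, not the authors' implementation, of the equal-width baseline mentioned in the abstract: a histogram is treated as a density estimate, and the number of bins is chosen by cross-validated log-likelihood. The function names, the 10-fold split, the `max_bins` limit, and the add-one smoothing of bin counts are all assumptions made for illustration; the abstract does not specify them. The same scoring function could in principle be applied to adaptively chosen cut points.

```python
import math
import random


def histogram_log_likelihood(train, test, cuts, lo, hi):
    """Log-likelihood of `test` under a histogram fitted to `train`.

    `cuts` are interior cut points; `lo` and `hi` bound the data range.
    Add-one smoothing (an assumption, not taken from the paper) keeps the
    density of empty bins above zero so the logarithm stays finite.
    """
    edges = [lo] + sorted(cuts) + [hi]
    k = len(edges) - 1
    counts = [0] * k

    def bin_index(x):
        # Values at or beyond the last interior cut fall into the last bin.
        for i in range(k - 1):
            if x < edges[i + 1]:
                return i
        return k - 1

    for x in train:
        counts[bin_index(x)] += 1

    n = len(train)
    total = 0.0
    for x in test:
        i = bin_index(x)
        width = edges[i + 1] - edges[i]
        # Smoothed density: integrates to 1 over [lo, hi].
        total += math.log((counts[i] + 1) / ((n + k) * width))
    return total


def equal_width_cuts(lo, hi, k):
    """Interior cut points of k equal-width bins on [lo, hi]."""
    return [lo + (hi - lo) * j / k for j in range(1, k)]


def cv_log_likelihood(data, k, folds=10, seed=0):
    """Sum of held-out log-likelihoods over `folds` folds for k bins."""
    lo, hi = min(data), max(data)
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    cuts = equal_width_cuts(lo, hi, k)
    total = 0.0
    for f in range(folds):
        test = shuffled[f::folds]
        train = [x for i, x in enumerate(shuffled) if i % folds != f]
        total += histogram_log_likelihood(train, test, cuts, lo, hi)
    return total


def select_num_bins(data, max_bins=20):
    """Number of equal-width bins with the highest cross-validated score."""
    return max(range(1, max_bins + 1), key=lambda k: cv_log_likelihood(data, k))


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical bimodal sample, used only to exercise the sketch.
    sample = [rng.gauss(0, 0.3) for _ in range(100)] + \
             [rng.gauss(10, 0.3) for _ in range(100)]
    print("selected number of bins:", select_num_bins(sample))
```

Scoring on held-out folds rather than on the training data matters here: the training log-likelihood of a histogram generally keeps increasing as bins get narrower, so it cannot be used on its own to pick the number of bins.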
Type
Conference Contribution
Citation
Schmidberger, G. & Frank, E. (2005). Unsupervised discretization using tree-based density estimation. In A. Jorge et al. (Eds.), Proceedings of the 9th European Conference on Principles and Practice of Knowledge Discovery in Databases, Porto, Portugal, October 3-7, 2005 (pp. 240-251). Berlin: Springer.
Date
2005
Publisher
Springer, Berlin