Unsupervised discretization using tree-based density estimation
Authors
Schmidberger, G.; Frank, E.
Rights
This is an author’s accepted version of a conference paper published in the Proceedings of the 9th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD 2005). © 2005 Springer.
Abstract
This paper presents an unsupervised discretization method that performs density estimation for univariate data. The subintervals that the discretization produces can be used as the bins of a histogram. Histograms are a simple and widely understood means of displaying data, and our method automatically adapts bin widths to the data. It uses the log-likelihood as the scoring function to select cut points and the cross-validated log-likelihood to select the number of intervals. We compare this method with equal-width discretization, where we also select the number of bins using the cross-validated log-likelihood, and with equal-frequency discretization.
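As a rough illustration of the selection criterion mentioned in the abstract, the following Python sketch picks the number of bins for an equal-width histogram by maximizing the cross-validated log-likelihood. It illustrates only the equal-width baseline, not the tree-based method of the paper. The function names, the add-one smoothing for empty bins, the 10-fold split, and the use of the full data range for the bin edges are all assumptions made for this example and are not taken from the paper.

```python
import bisect
import math
import random


def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 equally spaced bin boundaries on [lo, hi]."""
    return [lo + (hi - lo) * i / n_bins for i in range(n_bins + 1)]


def bin_index(x, edges):
    """Index of the bin containing x; values outside the range are clamped."""
    i = bisect.bisect_right(edges, x) - 1
    return min(max(i, 0), len(edges) - 2)


def log_likelihood(train, test, edges):
    """Log-likelihood of `test` under a histogram density fitted on `train`."""
    n_bins = len(edges) - 1
    counts = [0] * n_bins
    for x in train:
        counts[bin_index(x, edges)] += 1
    total = 0.0
    for x in test:
        i = bin_index(x, edges)
        width = edges[i + 1] - edges[i]
        # Assumption: add-one smoothing so that empty bins do not yield log(0).
        density = (counts[i] + 1) / ((len(train) + n_bins) * width)
        total += math.log(density)
    return total


def cv_log_likelihood(data, n_bins, folds=10, seed=0):
    """Sum of held-out log-likelihoods over `folds` cross-validation folds."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    # Simplification: bin edges span the range of the full data set.
    edges = equal_width_edges(min(data), max(data), n_bins)
    total = 0.0
    for f in range(folds):
        test = shuffled[f::folds]
        train = [x for i, x in enumerate(shuffled) if i % folds != f]
        total += log_likelihood(train, test, edges)
    return total


def choose_num_bins(data, max_bins=20):
    """Number of bins in 1..max_bins with the highest cross-validated score."""
    return max(range(1, max_bins + 1), key=lambda k: cv_log_likelihood(data, k))


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical bimodal sample; requires at least two distinct values.
    sample = [rng.gauss(0, 1) for _ in range(300)] + [rng.gauss(6, 0.5) for _ in range(300)]
    print("selected number of bins:", choose_num_bins(sample))
```

Scoring on held-out data penalizes histograms with too many bins, because narrow bins fitted on the training folds assign low density to test points that fall into sparsely populated intervals.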
Citation
Schmidberger, G. & Frank, E. (2005). Unsupervised discretization using tree-based density estimation. In A. Jorge et al. (Eds.), Proceedings of the 9th European Conference on Principles and Practice of Knowledge Discovery in Databases, Porto, Portugal, October 3-7, 2005 (pp. 240-251). Berlin: Springer.
Publisher
Springer-Verlag Berlin Heidelberg