Feature selection for discrete and numeric class machine learning

Abstract
Algorithms for feature selection fall into two broad categories: wrappers use the learning algorithm itself to evaluate the usefulness of features, while filters evaluate features according to heuristics based on general characteristics of the data. For application to large databases, filters have proven to be more practical than wrappers because they are much faster. However, most existing filter algorithms only work with discrete classification problems. This paper describes a fast, correlation-based filter algorithm that can be applied to continuous and discrete problems. Experiments using the new method as a preprocessing step for naïve Bayes, instance-based learning, decision trees, locally weighted regression, and model trees show it to be an effective feature selector: it reduces the dimensionality of the data by more than sixty percent in most cases without negatively affecting accuracy. In addition, decision and model trees built from the pre-processed data are often significantly smaller.
Type
Working Paper
Series
Computer Science Working Papers
Citation
Hall, M.A. (1999). Feature selection for discrete and numeric class machine learning. (Working paper 99/04). Hamilton, New Zealand: University of Waikato, Department of Computer Science.
Date
1999-04
Publisher
Computer Science, University of Waikato
Rights