Feature subset selection: a correlation based filter approach

Hall, Mark A.; Smith, Lloyd A.

Abstract

Recent work has shown that feature subset selection can have a positive effect on the performance of machine learning algorithms. Some algorithms can be slowed or their performance adversely affected by too much data, some of which may be irrelevant or redundant to the learning task. Feature subset selection, then, is a method of enhancing the performance of learning algorithms, reducing the hypothesis search space, and, in some cases, reducing the storage requirement. This paper describes a feature subset selector that uses a correlation based heuristic to determine the goodness of feature subsets, and evaluates its effectiveness with three common ML algorithms: a decision tree inducer (C4.5), a naive Bayes classifier, and an instance based learner (IB1). Experiments using a number of standard data sets drawn from real and artificial domains are presented. Feature subset selection gave significant improvement for all three algorithms; C4.5 generated smaller decision trees.

Type

Conference Contribution

Citation

Hall, M. A. & Smith, L. A. (1997). Feature subset selection: a correlation based filter approach. In 1997 International Conference on Neural Information Processing and Intelligent Information Systems (pp. 855-858). Berlin: Springer.

Date

1997

Publisher

Springer

Rights

This is an author’s version of an article published in 1997 International Conference on Neural Information Processing and Intelligent Information Systems. © Springer.
