Hempstalk, K. & Frank, E. (2008). Discriminating Against New Classes: One-class versus Multi-class Classification. In W. Wobcke & M. Zhang (Eds.), Proceedings 21st Australasian Joint Conference on Artificial Intelligence, Auckland, New Zealand, December 1-5, 2008 (pp. 325-336). Berlin: Springer.
Permanent Research Commons link: http://hdl.handle.net/10289/1731
Many applications require the ability to identify data that is anomalous with respect to a target group of observations, in the sense of belonging to a new, previously unseen ‘attacker’ class. One possible approach to this kind of verification problem is one-class classification, which learns a description of the target class based solely on data from that class. However, if known non-target classes are available at training time, it is also possible to use standard multi-class or two-class classification, exploiting the negative data to infer a description of the target class. In this paper we assume that this scenario holds and investigate under what conditions multi-class and two-class Naïve Bayes classifiers are preferable to the corresponding one-class model when the aim is to identify examples from a new ‘attacker’ class. To this end we first identify a way of performing a fair comparison between the techniques concerned and present an adaptation of standard cross-validation. This is one of the main contributions of the paper. Based on the experimental results obtained, we then show which group of techniques is likely to be preferable under which conditions. Our main finding is that multi-class and two-class classification become preferable to one-class classification when a sufficiently large number of non-target classes is available.
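The comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation or its adapted cross-validation procedure. It assumes Gaussian Naïve Bayes models, a single train/test split of the target class, AUC as the measure, and synthetic two-dimensional data. All names (`fit_gaussian`, `evaluate`, `make_blob`, the class labels) are hypothetical. The key idea it tries to capture is that the ‘attacker’ class is withheld from training for both the one-class and the multi-class model, so both are judged on the same unseen class.

```python
import math
import random

def fit_gaussian(rows):
    """Per-feature mean and variance (Naive Bayes independence assumption)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    # A small floor keeps every variance strictly positive.
    varis = [max(sum((r[j] - means[j]) ** 2 for r in rows) / n, 1e-6)
             for j in range(d)]
    return means, varis

def log_density(model, x):
    """Log of the product of independent per-feature Gaussian densities."""
    means, varis = model
    return sum(-0.5 * math.log(2 * math.pi * v) - (xj - m) ** 2 / (2 * v)
               for xj, m, v in zip(x, means, varis))

def one_class_score(target_model, x):
    """Higher means more target-like; uses the target model only."""
    return log_density(target_model, x)

def multi_class_score(models, priors, target, x):
    """Log posterior of the target class among the classes seen in training."""
    joint = {c: math.log(priors[c]) + log_density(m, x)
             for c, m in models.items()}
    top = max(joint.values())  # log-sum-exp for numerical stability
    log_norm = top + math.log(sum(math.exp(v - top) for v in joint.values()))
    return joint[target] - log_norm

def auc(pos_scores, neg_scores):
    """Probability that a target example outscores an attacker example."""
    wins = sum((p > n) + 0.5 * (p == n)
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))

def evaluate(data, target, attacker, test_fraction=0.3, seed=0):
    """Return (one-class AUC, multi-class AUC) with the attacker held out."""
    rng = random.Random(seed)
    rows = data[target][:]
    rng.shuffle(rows)
    cut = int(len(rows) * (1 - test_fraction))
    train_t, test_t = rows[:cut], rows[cut:]
    # Assumption of this sketch: the attacker class is never used in training.
    known = {c: r for c, r in data.items() if c not in (target, attacker)}
    models = {target: fit_gaussian(train_t)}
    models.update({c: fit_gaussian(r) for c, r in known.items()})
    total = len(train_t) + sum(len(r) for r in known.values())
    priors = {target: len(train_t) / total}
    priors.update({c: len(r) / total for c, r in known.items()})
    one = auc([one_class_score(models[target], x) for x in test_t],
              [one_class_score(models[target], x) for x in data[attacker]])
    multi = auc(
        [multi_class_score(models, priors, target, x) for x in test_t],
        [multi_class_score(models, priors, target, x) for x in data[attacker]])
    return one, multi

def make_blob(center, n, rng):
    """Synthetic class: unit-variance Gaussian cloud around `center`."""
    return [[rng.gauss(c, 1.0) for c in center] for _ in range(n)]

rng = random.Random(42)
data = {
    "target": make_blob([0.0, 0.0], 100, rng),
    "known_a": make_blob([4.0, 0.0], 100, rng),
    "known_b": make_blob([0.0, 4.0], 100, rng),
    "attacker": make_blob([5.0, 5.0], 100, rng),
}
one_auc, multi_auc = evaluate(data, "target", "attacker")
print(f"one-class AUC: {one_auc:.3f}, multi-class AUC: {multi_auc:.3f}")
```

Repeating `evaluate` with each non-target class taking a turn as the attacker, and with varying numbers of known non-target classes, would give a rough picture of how the two approaches compare as more negative classes become available. The synthetic data here says nothing about the paper's actual results.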