Mutter, S., Pfahringer, B. & Holmes, G. (2008). Propositionalisation of Profile Hidden Markov Models for Biological Sequence Analysis. In W. Wobcke & M. Zhang (Eds.), Proceedings of 21st Australasian Joint Conference on Artificial Intelligence, Auckland, New Zealand, December 1-5, 2008 (pp. 278-288). Berlin, Germany: Springer.
Permanent Research Commons link: https://hdl.handle.net/10289/1762
Hidden Markov Models (HMMs) are a widely used generative model for analysing sequence data. A variant, the Profile Hidden Markov Model (PHMM), is used in Bioinformatics to represent, for example, protein families. In this paper we introduce a simple propositionalisation method for Profile Hidden Markov Models. The method allows the use of PHMMs discriminatively in a classification task. Previously, kernel approaches have been proposed to generate a discriminative description for an HMM, but these require the explicit definition of a similarity measure for HMMs. Propositionalisation does not need such a measure and allows the use of any propositional learner, including kernel-based approaches. We show empirically that using propositionalisation leads to higher accuracies in comparison with PHMMs on benchmark datasets.
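To make the idea of propositionalisation concrete, the following minimal sketch turns a variable-length sequence into a fixed-length attribute vector that any propositional learner could consume. It is an illustration only and not the feature construction used in the paper, which the abstract does not specify. It assumes plain discrete HMMs rather than full profile HMMs, and uses one log-likelihood per model as the attributes. The names `forward_log_likelihood` and `propositionalise`, and all model parameters, are hypothetical.

```python
import math


def forward_log_likelihood(seq, start, trans, emit):
    """Return log P(seq | HMM) using the scaled forward algorithm.

    start[i]        : initial probability of state i
    trans[i][j]     : probability of moving from state i to state j
    emit[i][symbol] : probability that state i emits symbol
    """
    n = len(start)
    log_lik = 0.0
    alpha = None
    for t, symbol in enumerate(seq):
        if t == 0:
            alpha = [start[i] * emit[i].get(symbol, 0.0) for i in range(n)]
        else:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n))
                * emit[j].get(symbol, 0.0)
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            # The model assigns zero probability to this sequence.
            return float("-inf")
        # Rescale to avoid numerical underflow on long sequences.
        alpha = [a / scale for a in alpha]
        log_lik += math.log(scale)
    return log_lik


def propositionalise(seq, models):
    """Map a sequence to a fixed-length vector: one log-likelihood per model.

    Assumption for illustration: `models` holds one (start, trans, emit)
    triple per class, e.g. one HMM per protein family.
    """
    return [forward_log_likelihood(seq, *model) for model in models]


# Two toy two-state models over the alphabet {"A", "B"} (made-up parameters).
model_a = (
    [0.6, 0.4],
    [[0.7, 0.3], [0.4, 0.6]],
    [{"A": 0.9, "B": 0.1}, {"A": 0.2, "B": 0.8}],
)
model_b = (
    [0.5, 0.5],
    [[0.5, 0.5], [0.5, 0.5]],
    [{"A": 0.1, "B": 0.9}, {"A": 0.3, "B": 0.7}],
)

features = propositionalise("AAB", [model_a, model_b])
print(features)
```

Each sequence, whatever its length, yields a vector with one entry per model, so the resulting table of vectors and class labels can be passed to any standard attribute-value classifier without defining a similarity measure between HMMs.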