Propositionalisation of Profile Hidden Markov Models for Biological Sequence Analysis
Abstract
Hidden Markov Models (HMMs) are widely used generative models for analysing sequence data. Profile Hidden Markov Models (PHMMs) are a special case used in bioinformatics to represent, for example, protein families. In this paper we introduce a simple propositionalisation method for PHMMs. The method allows PHMMs to be used discriminatively in a classification task. Previously, kernel approaches have been proposed to generate a discriminative description for an HMM, but they require the explicit definition of a similarity measure for HMMs. Propositionalisation does not need such a measure and allows the use of any propositional learner, including kernel-based approaches. We show empirically that propositionalisation leads to higher accuracies than PHMMs on benchmark datasets.
Type
Conference Contribution
Citation
Mutter, S., Pfahringer, B., & Holmes, G. (2008). Propositionalisation of Profile Hidden Markov Models for Biological Sequence Analysis. In W. Wobcke & M. Zhang (Eds.), Proceedings of 21st Australasian Joint Conference on Artificial Intelligence, Auckland, New Zealand, December 1-5, 2008 (pp. 278-288). Berlin, Germany: Springer.
Date
2008
Publisher
Springer