dc.contributor.author: Wen, Yingying
dc.contributor.author: Witten, Ian H.
dc.contributor.author: Wang, Dianhui
dc.coverage.spatial: Conference held at Perth, Australia [en_NZ]
dc.date.accessioned: 2008-11-14T01:40:04Z
dc.date.available: 2008-11-14T01:40:04Z
dc.date.issued: 2003
dc.identifier.citation: Wen, Y., Witten, I.H. & Wang, D. (2003). Token identification using HMM and PPM models. In AI 2003: Advances in Artificial Intelligence, 16th Australian Conference on AI, Perth, Australia, December 3-5, 2003, Proceedings (pp. 173-185). Berlin: Springer. [en_US]
dc.identifier.uri: https://hdl.handle.net/10289/1335
dc.description.abstract: Hidden Markov models (HMMs) and prediction by partial matching (PPM) models have been successfully used in language processing tasks, including learning-based token identification. Most existing systems are domain- and language-dependent, which limits their retargetability and applicability. This paper investigates the effect of combining HMMs and PPM on token identification. We implement a system that bridges the two well-known methods through words new to the identification model. The system is fully domain- and language-independent: no code changes are necessary when applying it to other domains or languages. The only required input is an annotated corpus. The system has been tested on two corpora and achieved an overall F-measure of 69.02% for TCC and 76.59% for BIB. Although the performance is not as good as that obtained from a system with language-dependent components, the proposed system can deal with a large scope of domain- and language-independent problems. Identification of dates gives the best result: 73% and 92% of correct tokens are identified for the two corpora, respectively. The system also performs reasonably well on people's names, with 68% of correct tokens identified for TCC and 76% for BIB. [en_US]
dc.language.iso: en
dc.publisher: Springer [en_US]
dc.relation.uri: http://www.springerlink.com/content/r00vnubyw1jw09wt/ [en_US]
dc.source: AI 2003 [en_NZ]
dc.subject: computer science [en_US]
dc.subject: Machine learning
dc.title: Token identification using HMM and PPM models [en_US]
dc.type: Conference Contribution [en_US]
dc.identifier.doi: 10.1007/978-3-540-24581-0_15 [en_US]
dc.relation.isPartOf: Proc 16th Australian Conference: Advances in Artificial Intelligence [en_NZ]
pubs.begin-page: 173 [en_NZ]
pubs.elements-id: 18442
pubs.end-page: 185 [en_NZ]
pubs.finish-date: 2003-12-05 [en_NZ]
pubs.start-date: 2003-12-03 [en_NZ]
pubs.volume: 2903 [en_NZ]

