Using compression to identify acronyms in text

dc.contributor.author	Yeates, Stuart Andrew
dc.contributor.author	Bainbridge, David
dc.contributor.author	Witten, Ian H.
dc.date.accessioned	2008-10-10T04:07:46Z
dc.date.available	2008-10-10T04:07:46Z
dc.date.issued	2000-01
dc.description.abstract	Text mining is about looking for patterns in natural language text, and may be defined as the process of analyzing text to extract information from it for particular purposes. In previous work, we claimed that compression is a key technology for text mining, and backed this up with a study that showed how particular kinds of lexical tokens - names, dates, locations, etc. - can be identified and located in running text, using compression models to provide the leverage necessary to distinguish different token types.	en_US
dc.format.mimetype	application/pdf
dc.identifier.citation	Yeates, S., Bainbridge, D. & Witten, I.H. (2000). Using compression to identify acronyms in text. (Working paper 00/01). Hamilton, New Zealand: University of Waikato, Department of Computer Science.	en_US
dc.identifier.issn	1170-487X
dc.identifier.uri	https://hdl.handle.net/10289/1018
dc.language.iso	en
dc.publisher	University of Waikato, Department of Computer Science	en_US
dc.relation.ispartofseries	Computer Science Working Papers
dc.subject	computer science	en_US
dc.subject	text mining	en_US
dc.subject	compression	en_US
dc.subject	Machine learning
dc.title	Using compression to identify acronyms in text	en_US
dc.type	Working Paper	en_US
uow.relation.series	00/01