Using compression to identify acronyms in text
Citation
Yeates, S., Bainbridge, D. & Witten, I.H. (2000). Using compression to identify acronyms in text. (Working paper 00/01). Hamilton, New Zealand: University of Waikato, Department of Computer Science.
Permanent Research Commons link: https://hdl.handle.net/10289/1018
Abstract
Text mining is about looking for patterns in natural language text, and may be defined as the process of analyzing text to extract information from it for particular purposes. In previous work, we claimed that compression is a key technology for text mining, and backed this up with a study that showed how particular kinds of lexical tokens - names, dates, locations, etc. - can be identified and located in running text, using compression models to provide the leverage necessary to distinguish different token types.
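The idea of using compression models to tell token types apart can be illustrated with a minimal sketch. It is not the authors' implementation: a smoothed character-bigram model stands in for a real compression model, and the class name `CharModel`, the function `classify`, and the tiny training lists are all hypothetical. One model is trained per token type, and a token is assigned to the type whose model would encode it in the fewest bits.

```python
import math
from collections import defaultdict


class CharModel:
    """Order-1 character model with add-one smoothing.

    A simplified, hypothetical stand-in for a real compression model.
    """

    def __init__(self, examples, alphabet_size=256):
        self.alphabet_size = alphabet_size
        self.pair_counts = defaultdict(int)     # (previous char, char) -> count
        self.context_counts = defaultdict(int)  # previous char -> count
        for text in examples:
            prev = "^"  # start-of-token marker
            for ch in text:
                self.pair_counts[(prev, ch)] += 1
                self.context_counts[prev] += 1
                prev = ch

    def bits(self, text):
        """Estimated code length of `text` in bits under this model."""
        total = 0.0
        prev = "^"
        for ch in text:
            seen = self.pair_counts.get((prev, ch), 0)
            context = self.context_counts.get(prev, 0)
            p = (seen + 1) / (context + self.alphabet_size)
            total -= math.log2(p)
            prev = ch
        return total


def classify(token, models):
    """Return the label whose model encodes `token` most compactly."""
    return min(models, key=lambda label: models[label].bits(token))


# Toy training data, invented for illustration only.
models = {
    "name": CharModel(["Ian Witten", "Stuart Yeates", "David Bainbridge",
                       "John Smith", "Mary Jones"]),
    "date": CharModel(["12 March 1999", "2000-01-15", "1 January 2000",
                       "25/12/1998", "3 June 1997"]),
}

print(classify("1999-07-04", models))
print(classify("Ian Smith", models))
```

With realistic amounts of training text, a stronger model would normally replace the bigram counts; the comparison of code lengths across per-type models is the part that carries over.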
Date
2000-01
Type
Report No.
00/01
Publisher
University of Waikato, Department of Computer Science
Collections
- 2000 Working Papers [12]