Using compression to identify acronyms in text

dc.contributor.author: Yeates, Stuart Andrew
dc.contributor.author: Bainbridge, David
dc.contributor.author: Witten, Ian H.
dc.date.accessioned: 2008-10-10T04:07:46Z
dc.date.available: 2008-10-10T04:07:46Z
dc.date.issued: 2000-01
dc.description.abstract: Text mining is about looking for patterns in natural language text, and may be defined as the process of analyzing text to extract information from it for particular purposes. In previous work, we claimed that compression is a key technology for text mining, and backed this up with a study that showed how particular kinds of lexical tokens (names, dates, locations, etc.) can be identified and located in running text, using compression models to provide the leverage necessary to distinguish different token types. [en_US]
dc.format.mimetype: application/pdf
dc.identifier.citation: Yeates, S., Bainbridge, D. & Witten, I.H. (2000). Using compression to identify acronyms in text. (Working paper 00/01). Hamilton, New Zealand: University of Waikato, Department of Computer Science. [en_US]
dc.identifier.issn: 1170-487X
dc.identifier.uri: https://hdl.handle.net/10289/1018
dc.language.iso: en
dc.publisher: University of Waikato, Department of Computer Science [en_US]
dc.relation.ispartofseries: Computer Science Working Papers
dc.subject: computer science [en_US]
dc.subject: text mining [en_US]
dc.subject: compression [en_US]
dc.subject: Machine learning
dc.title: Using compression to identify acronyms in text [en_US]
dc.type: Working Paper [en_US]
uow.relation.series: 00/01
Files
Original bundle
Name: uow-cs-wp-2000-01.pdf
Size: 726.6 KB
Format: Adobe Portable Document Format