dc.contributor.author: Frank, Eibe
dc.contributor.author: Chui, Chang
dc.contributor.author: Witten, Ian H.
dc.date.accessioned: 2008-10-10T04:10:08Z
dc.date.available: 2008-10-10T04:10:08Z
dc.date.issued: 2000-01
dc.identifier.citation: Frank, E., Chui, C. & Witten, I.H. (2000). Text categorization using compression models. (Working paper 00/02). Hamilton, New Zealand: University of Waikato, Department of Computer Science. [en_US]
dc.identifier.issn: 1170-487X
dc.identifier.uri: https://hdl.handle.net/10289/1019
dc.description.abstract: Text categorization, or the assignment of natural language texts to predefined categories based on their content, is of growing importance as the volume of information available on the internet continues to overwhelm us. The use of predefined categories implies a “supervised learning” approach to categorization, where already-classified articles which effectively define the categories are used as “training data” to build a model that can be used for classifying new articles that comprise the “test data”. This contrasts with “unsupervised” learning, where there is no training data and clusters of like documents are sought amongst the test articles. With supervised learning, meaningful labels (such as keyphrases) are attached to the training documents, and appropriate labels can be assigned automatically to test documents depending on which category they fall into. [en_US]
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.publisher: University of Waikato, Department of Computer Science [en_US]
dc.relation.ispartofseries: Computer Science Working Papers
dc.subject: computer science [en_US]
dc.subject: text categorization [en_US]
dc.subject: compression [en_US]
dc.subject: Machine learning
dc.title: Text categorization using compression models [en_US]
dc.type: Working Paper [en_US]
uow.relation.series: 00/02

