Publication:
Text categorization using compression models

dc.contributor.author: Frank, Eibe (en_NZ)
dc.contributor.author: Chui, Chang (en_NZ)
dc.contributor.author: Witten, Ian H. (en_NZ)
dc.date.accessioned: 2024-08-29T02:30:52Z
dc.date.available: 2024-08-29T02:30:52Z
dc.date.issued: 2000 (en_NZ)
dc.description: waiting for verification
dc.description.abstract: Text categorization, or the assignment of natural language texts to predefined categories based on their content, is of growing importance as the volume of information available on the internet continues to overwhelm us. The use of predefined categories implies a “supervised learning” approach to categorization, where already-classified articles—which effectively define the categories—are used as “training data” to build a model that can be used for classifying new articles that comprise the “test data.” This contrasts with “unsupervised” learning, where there is no training data and clusters of like documents are sought amongst the test articles. With supervised learning, meaningful labels (such as keyphrases) are attached to the training documents, and appropriate labels can be assigned automatically to test documents depending on which category they fall into.
dc.identifier.uri: https://hdl.handle.net/10289/16852
dc.language.iso: en
dc.publisher: Department of Computer Science, University of Waikato
dc.rights: © The Authors 2000
dc.title: Text categorization using compression models (en_NZ)
dc.type: Working Paper
dspace.entity.type: Publication
pubs.confidential: false (en_NZ)
pubs.place-of-publication: Hamilton, New Zealand

Files

Original bundle

Name: Frank_categorization.full.pdf
Size: 153.91 KB
Format: Adobe Portable Document Format
Description: Published version

License bundle

Name: license.txt
Size: 1.7 KB
Format: Item-specific license agreed upon to submission