Better text compression from fewer lexical n-grams
dc.contributor.author | Smith, Tony C. | |
dc.contributor.author | Lorenz, Michelle | |
dc.coverage.spatial | Conference held at Snowbird, Utah | en_NZ |
dc.date.accessioned | 2008-12-18T20:53:34Z | |
dc.date.available | 2008-12-18T20:53:34Z | |
dc.date.issued | 2001 | |
dc.description.abstract | Word-based context models for text compression can outperform simpler character-based models, but are generally unattractive because of inherent problems with exponential model growth and the corresponding data sparseness. These ill effects can be mitigated in an adaptive lossless compression scheme by modelling syntactic and semantic lexical dependencies independently. | en |
dc.format.mimetype | application/pdf | |
dc.identifier.citation | Smith, T.C. & Lorenz, M. (2001). Better text compression from fewer lexical n-grams. In Proceedings of Data Compression Conference (DCC ’01). Washington, DC, USA: IEEE Computer Society. | en |
dc.identifier.doi | 10.1109/DCC.2001.10047 | en |
dc.identifier.uri | https://hdl.handle.net/10289/1722 | |
dc.language.iso | en | |
dc.publisher | IEEE Computer Society | en |
dc.relation.isPartOf | DCC 2001: IEEE Data Compression Conference | en_NZ |
dc.rights | This paper has been published in the Proceedings of Data Compression Conference (DCC ’01). ©2001 IEEE Computer Society. | en |
dc.subject | computer science | en |
dc.subject | text compression | en |
dc.subject | machine learning | |
dc.title | Better text compression from fewer lexical n-grams | en |
dc.type | Conference Contribution | en |
pubs.begin-page | 516 | en_NZ |
pubs.elements-id | 11594 | |
pubs.end-page | 516 | en_NZ |
pubs.finish-date | 2001-03-29 | en_NZ |
pubs.start-date | 2001-03-27 | en_NZ |
Files
Original bundle
- Name: Better Text Compression from Fewer N-Grams.pdf
- Size: 64.77 KB
- Format: Adobe Portable Document Format