
Better text compression from fewer lexical n-grams

Abstract
Word-based context models for text compression have the capacity to outperform simpler character-based models, but are generally unattractive because of inherent problems with exponential model growth and corresponding data sparseness. These ill effects can be mitigated in an adaptive lossless compression scheme by modelling syntactic and semantic lexical dependencies independently.
Type
Conference Contribution
Citation
Smith, T.C. & Lorenz, M. (2001). Better text compression from fewer lexical n-grams. In Proceedings of Data Compression Conference (DCC '01). Washington, DC, USA: IEEE Computer Society.
Date
2001
Publisher
IEEE Computer Society
Rights
This paper has been published in the Proceedings of Data Compression Conference (DCC '01). © 2001 IEEE Computer Society.