Research Commons

      Lexical attraction for text compression

      Bach, Joscha; Witten, Ian H.
      Files
      uow-cs-wp-1999-01.pdf
      761.1Kb
      DOI
       10.1109/DCC.1999.785673
      Citation
      Bach, J. & Witten, I.H. (1999). Lexical attraction for text compression. (Working paper 99/01). Hamilton, New Zealand: University of Waikato, Department of Computer Science.
      Permanent Research Commons link: https://hdl.handle.net/10289/1030
      Abstract
      New methods of acquiring structural information in text documents may support better compression by identifying an appropriate prediction context for each symbol. The method of “lexical attraction” infers syntactic dependency structures from statistical analysis of large corpora. We describe the generation of a lexical attraction model, discuss its application to text compression, and explore its potential to outperform fixed-context models such as word-level PPM. Perhaps the most exciting aspect of this work is the prospect of using compression as a metric for structure discovery in text.
      Date
      1999-01
      Type
      Working Paper
      Series
      Computer Science Working Papers
      Report No.
      99/01
      Publisher
      Computer Science, University of Waikato
      Collections
      • 1999 Working Papers [16]

