Research Commons
      • Browse 
        • Communities & Collections
        • Titles
        • Authors
        • By Issue Date
        • Subjects
        • Types
        • Series
      • Help 
        • About
        • Collection Policy
        • OA Mandate Guidelines
        • Guidelines FAQ
        • Contact Us
      • My Account 
        • Sign In
        • Register
      View Item 
      •   Research Commons
      • University of Waikato Research
      • Computing and Mathematical Sciences
      • Computing and Mathematical Sciences Papers
      • View Item
      •   Research Commons
      • University of Waikato Research
      • Computing and Mathematical Sciences
      • Computing and Mathematical Sciences Papers
      • View Item
      JavaScript is disabled for your browser. Some features of this site may not work without it.

      Online and offline heuristics for inferring hierarchies of repetitions in sequences

      Nevill-Manning, Craig G.; Witten, Ian H.
      Thumbnail
      Files
      00NM-IHW-Onlineoffline.pdf
      701.4Kb
      DOI
       10.1109/5.892710
      Link
       ieeexplore.ieee.org
      Find in your library  
      Citation
      Export citation
      Nevill-Manning, C.G. & Witten, I.H. (2000). Online and offline heuristics for inferring hierarchies of repetitions in sequences. Proceedings of the IEEE, 88(11), 1745-1755.
      Permanent Research Commons link: https://hdl.handle.net/10289/1319
      Abstract
      Hierarchical dictionary-based compression schemes form a grammar for a text by replacing each repeated string with a production rule. While such schemes usually operate online, making a replacement as soon as repetition is detected, offline operation permits greater freedom in choosing the order of replacement. In this paper, we compare the online method with three offline heuristics for selecting the next substring to replace: longest string first, most common string first, and the string that minimized the size of the grammar locally. Surprisingly, two of the offline techniques, like the online method, run in time linear in the size of the input. We evaluate each technique on artificial and natural sequences. In general, the locally-most-compressive heuristic performs best, followed by most frequent, the online technique, and, lagging by some distance, the longest-first technique.
      Date
      2000
      Type
      Journal Article
      Publisher
      IEEE, New York
      Rights
      This article has been published in the journal: Proc IEEE, ©2008 IEEE.
      Collections
      • Computing and Mathematical Sciences Papers [1455]
      Show full item record  

      Usage

      Downloads, last 12 months
      78
       
       
       

      Usage Statistics

      For this itemFor all of Research Commons

      The University of Waikato - Te Whare Wānanga o WaikatoFeedback and RequestsCopyright and Legal Statement