Research Commons
      • Browse 
        • Communities & Collections
        • Titles
        • Authors
        • By Issue Date
        • Subjects
        • Types
        • Series
      • Help 
        • About
        • Collection Policy
        • OA Mandate Guidelines
        • Guidelines FAQ
        • Contact Us
      • My Account 
        • Sign In
        • Register
      View Item 
      •   Research Commons
      • University of Waikato Research
      • Computing and Mathematical Sciences
      • Computer Science Working Paper Series
      • 1996 Working Papers
      • View Item
      •   Research Commons
      • University of Waikato Research
      • Computing and Mathematical Sciences
      • Computer Science Working Paper Series
      • 1996 Working Papers
      • View Item
      JavaScript is disabled for your browser. Some features of this site may not work without it.

      Identifying hierarchical structure in sequences: a linear-time algorithm

      Nevill-Manning, Craig G.; Witten, Ian H.
      Thumbnail
      Files
      uow-cs-wp-1996-25.pdf
      2.976Mb
      DOI
       10.1613/jair.374
      Find in your library  
      Citation
      Export citation
      Nevill-Manning, C. G. & Witten, I. H. (1996). Authorship patterns in information systems. (Working paper 96/25). Hamilton, New Zealand: University of Waikato, Department of Computer Science.
      Permanent Research Commons link: https://hdl.handle.net/10289/1186
      Abstract
      This paper describes an algorithm that infers a hierarchical structure from a sequence of discrete symbols by replacing phrases which appear more than once by a grammatical rule that generates the phrase, and continuing this process recursively. The result is a hierarchical representation of the original sequence. The algorithm works by maintaining two constraints: every diagram in the grammar must be unique, and every rule must be used more than once. It breaks new ground by operating incrementally. Moreover, its simple structure permits a proof that it operates in space and time that is linear in the size of the input. Our implementation can process 10,000 symbols/second and has been applied to an extensive range of sequences encountered in practice.
      Date
      1996-11
      Type
      Working Paper
      Series
      Computer Science Working Papers
      Report No.
      96/25
      Collections
      • 1996 Working Papers [32]
      Show full item record  

      Usage

      Downloads, last 12 months
      80
       
       
       

      Usage Statistics

      For this itemFor all of Research Commons

      The University of Waikato - Te Whare Wānanga o WaikatoFeedback and RequestsCopyright and Legal Statement