Research Commons

      Adaptive text mining: Inferring structure from sequences

      Witten, Ian H.
      Files
      Adaptive Text Mining.pdf
      107.7Kb
      DOI
       10.1016/j.jda.2004.04.010
      Link
       www.sciencedirect.com
      Citation
      Witten, I.H. (2004). Adaptive text mining: Inferring structure from sequences. Journal of Discrete Algorithms, 2(2), pp. 137-159.
      Permanent Research Commons link: https://hdl.handle.net/10289/1296
      Abstract
      Text mining is about inferring structure from sequences representing natural language text, and may be defined as the process of analyzing text to extract information that is useful for particular purposes. Although hand-crafted heuristics are a common practical approach for extracting information from text, a general, and generalizable, approach requires adaptive techniques. This paper studies the way in which the adaptive techniques used in text compression can be applied to text mining. It develops several examples: extraction of hierarchical phrase structures from text, identification of keyphrases in documents, locating proper names and quantities of interest in a piece of text, text categorization, word segmentation, acronym extraction, and structure recognition. We conclude that compression forms a sound unifying principle that allows many text mining problems to be tackled adaptively.
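
      As a rough illustration of the compression-based view of text categorization mentioned in the abstract, the sketch below assigns a document to the category whose training text lets it compress best. It is not the paper's method: it swaps in Python's standard zlib (DEFLATE) compressor for the adaptive models used in text compression, and the function names `compressed_size` and `classify`, along with the tiny sample corpora, are hypothetical choices made only for this example.

```python
import zlib
from typing import Mapping


def compressed_size(text: str) -> int:
    """Number of bytes zlib needs to encode the text at maximum compression."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def classify(document: str, training: Mapping[str, str]) -> str:
    """Pick the label whose training text makes the document cheapest to encode.

    For each label, the cost is the growth in compressed size when the
    document is appended to that label's training text. This is a crude
    stand-in for the cross-entropy of the document under a model trained
    on the category (assumption: zlib replaces an adaptive model here).
    """
    def extra_bytes(label: str) -> int:
        corpus = training[label]
        return compressed_size(corpus + " " + document) - compressed_size(corpus)

    return min(training, key=extra_bytes)


if __name__ == "__main__":
    # Hypothetical training text, far too small for real use.
    training = {
        "sports": "the team won the match after scoring two goals in the second half",
        "finance": "shares fell as the bank raised interest rates and investors sold bonds",
    }
    print(classify("scoring two goals in the second half", training))
```

      With realistic amounts of training text per category, the same comparison can be made with any adaptive compressor; zlib is used here only because it ships with Python.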
      Date
      2004
      Type
      Journal Article
      Publisher
      Elsevier B.V.
      Rights
      This is an author’s version of an article published in the Journal of Discrete Algorithms, (c) 2008 Elsevier B.V.
      Collections
      • Computing and Mathematical Sciences Papers [1455]
      The University of Waikato - Te Whare Wānanga o Waikato