Research Commons

      Online evaluation of email streaming classifiers using GNUsmail

      Carmona-Cejudo, José M.; Baena-García, Manuel; Campo-Ávila, José; Bifet, Albert; Gama, João; Morales-Bueno, Rafael
      DOI
       10.1007/978-3-642-24800-9_11
      Link
       link.springer.com
      Citation
      Carmona-Cejudo, J. M., Baena-García, M., Campo-Ávila, J., Bifet, A., Gama, J., & Morales-Bueno, R. (2011). Online evaluation of email streaming classifiers using GNUsmail. Lecture Notes in Computer Science (D. Hutchison, T. Kanade, J. Kittler, J. M. Kleinberg, F. Mattern, J. C. Mitchell, & J. Hollmén, Eds.), Vol. 7014, pp. 90-100.
      Permanent Research Commons link: https://hdl.handle.net/10289/6957
      Abstract
      Real-time email classification is a challenging task because of its online nature, which makes it subject to concept drift. Identifying spam, where only two labels exist, has received great attention in the literature. We are nevertheless interested in classification involving multiple folders, which is an additional source of complexity. Moreover, neither cross-validation nor other sampling procedures are suitable for evaluating data streams. Therefore, other metrics, such as the prequential error, have been proposed. However, the prequential error poses some problems, which can be alleviated by using mechanisms such as fading factors.

      In this paper we present GNUsmail, an open-source extensible framework for email classification, and focus on its ability to perform online evaluation. GNUsmail’s architecture supports incremental and online learning, and it can be used to compare different online mining methods using state-of-the-art evaluation metrics.

      We show how GNUsmail can be used to compare different algorithms, and we describe the tool it includes for launching replicable experiments.
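
      The prequential error with a fading factor, mentioned in the abstract, can be illustrated with a minimal sketch. It assumes the commonly used recursive formulation (faded loss sum S_i = L_i + alpha * S_(i-1), faded count N_i = 1 + alpha * N_(i-1), error E_i = S_i / N_i), where each loss L_i is obtained by predicting on an email before the model learns from it. The function name, the alpha value, and the sample losses are hypothetical and are not taken from GNUsmail itself.

```python
from typing import Iterable, List


def prequential_errors(losses: Iterable[float], alpha: float = 1.0) -> List[float]:
    """Return the running prequential error after each example.

    alpha = 1.0 gives the plain prequential error (mean of all losses so far).
    alpha < 1.0 is a fading factor that down-weights older losses.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    faded_sum = 0.0    # S_i: faded sum of losses
    faded_count = 0.0  # N_i: faded number of examples seen
    errors = []
    for loss in losses:
        faded_sum = loss + alpha * faded_sum
        faded_count = 1.0 + alpha * faded_count
        errors.append(faded_sum / faded_count)
    return errors


# Hypothetical 0/1 losses: wrong on the first 5 emails, correct on the next 15.
losses = [1.0] * 5 + [0.0] * 15
plain = prequential_errors(losses, alpha=1.0)  # no fading
faded = prequential_errors(losses, alpha=0.8)  # assumed fading factor
```

      In this example the plain prequential error still carries the full weight of the early mistakes after 20 emails, while the faded estimate moves closer to the classifier's recent behaviour, which is the kind of problem fading factors are meant to alleviate.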
      Date
      2011
      Type
      Conference Contribution
      Publisher
      Springer
      Collections
      • Computing and Mathematical Sciences Papers [1455]
      The University of Waikato - Te Whare Wānanga o Waikato