2001 Working Papers


Recent Submissions

  • Publication
    A simple approach to ordinal classification
    (Working Paper, Department of Computer Science, University of Waikato, 2001) Frank, Eibe; Hall, Mark A.
    Machine learning methods for classification problems commonly assume that the class values are unordered. However, in many practical applications, the class values do exhibit a natural order—for example, when learning how to grade. The standard approach to ordinal classification converts the class value into a numeric quantity and applies a regression learner to the transformed data, translating the output back into a discrete class value in a post-processing step. A disadvantage of this method is that it can only be applied in conjunction with a regression scheme. In this paper, we present a simple method that enables standard classification algorithms to make use of ordering information in class attributes. By applying it in conjunction with a decision tree learner, we show that it outperforms the naive approach, which treats the class values as an unordered set. Compared to special-purpose algorithms for ordinal classification, our method has the advantage that it can be applied without any modification to the underlying learning scheme.
  • Publication
    Data structures for Z testing tools.
    (Working Paper, University of Waikato, Department of Computer Science, 2001-06-01) Utting, Mark
    This paper describes some of the difficulties and challenges that arise during the design of tools for validating Z specifications by testing and animation. We address three issues: handling undefined terms, simplification versus enumeration, and the representation of sets, and show how each issue has been handled in the Jaza tool. Finally, we report on a brief experimental comparison of existing Z animators and conclude that while the state of the art is improving, more work is needed to ensure that the tools are robust and respect the Z semantics.
  • Publication
    Interactive document summarisation.
    (Working Paper, University of Waikato, Department of Computer Science, 2001-02-01) Jones, Steve; Lundy, Stephen; Paynter, Gordon W.
    This paper describes the Interactive Document Summariser (IDS), a dynamic document summarisation system, which can help users of digital libraries to access on-line documents more effectively. IDS provides dynamic control over summary characteristics, such as length and topic focus, so that changes made by the user are instantly reflected in an on-screen summary. A range of 'summary-in-context' views support seamless transitions between summaries and their source documents. IDS creates summaries by extracting keyphrases from a document with the Kea system, scoring sentences according to the keyphrases that they contain, and then extracting the highest scoring sentences. We report an evaluation of IDS summaries, in which human assessors identified suitable summary sentences in source documents, against which IDS summaries were judged. We found that IDS summaries were better than baseline summaries, and identify the characteristics of Kea keyphrases that lead to the best summaries.
  • Publication
    A simple approach to ordinal classification.
    (Working Paper, University of Waikato, Department of Computer Science, 2001-11-01) Frank, Eibe; Hall, Mark A.
    Machine learning methods for classification problems commonly assume that the class values are unordered. However, in many practical applications the class values do exhibit a natural order, for example, when learning how to grade. The standard approach to ordinal classification converts the class value into a numeric quantity and applies a regression learner to the transformed data, translating the output back into a discrete class value in a post-processing step. A disadvantage of this method is that it can only be applied in conjunction with a regression scheme. In this paper we present a simple method that enables standard classification algorithms to make use of ordering information in class attributes. By applying it in conjunction with a decision tree learner, we show that it outperforms the naïve approach, which treats the class values as an unordered set. Compared to special-purpose algorithms for ordinal classification, our method has the advantage that it can be applied without any modification to the underlying learning scheme.
  • Publication
    Human evaluation of Kea, an automatic keyphrasing system.
    (Working Paper, University of Waikato, Department of Computer Science, 2001-02-01) Jones, Steve; Paynter, Gordon W.
    This paper describes an evaluation of the Kea automatic keyphrase extraction algorithm. Tools that automatically identify keyphrases are desirable because document keyphrases have numerous applications in digital library systems, but they are costly and time-consuming to assign manually. Keyphrase extraction algorithms are usually evaluated by comparison to author-specified keywords, but this methodology has several well-known shortcomings. The results presented in this paper are based on subjective evaluations of the quality and appropriateness of keyphrases by human assessors, and make a number of contributions. First, they validate previous evaluations of Kea that rely on author keywords. Second, they show Kea's performance is comparable to that of similar systems that have been evaluated by human assessors. Finally, they justify the use of author keyphrases as a performance metric by showing that authors generally choose good keywords.
  • Publication
    Experiences using Z animation tools.
    (Working Paper, University of Waikato, Department of Computer Science, 2001-05-01) Reeve, Greg; Reeves, Steve
    In this paper we describe our experience of using three different animation systems. We searched for and decided to use these tools in the context of a project which involved developing formal versions (in Z) of informal requirements documents, and then showing the formal versions to people in industry who were not Z users (or users of any formal techniques). An animator therefore seemed a good way of showing the behaviour of a system described formally without the audience having to learn Z. A requirement that the tools have to satisfy, however, is that they animate Z correctly (whatever that may mean) and behave adequately in terms of speed and presentation. We have to report that none of the tools we looked at satisfies these requirements, though to be fair all of them are still under development.
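
The "Interactive document summarisation" abstract above describes a three-step procedure: extract keyphrases, score sentences by the keyphrases they contain, and keep the highest-scoring sentences. The following is a minimal sketch of the last two steps only, not the IDS implementation. It assumes the keyphrases are supplied by hand rather than produced by Kea, that a sentence's score is simply the number of distinct keyphrases it contains (the abstract does not give the scoring formula), and that sentences can be split naively on end punctuation. The names `split_sentences`, `score_sentence`, and `summarise` are hypothetical.

```python
import re


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def score_sentence(sentence, keyphrases):
    """Assumed score: number of keyphrases found in the sentence, ignoring case."""
    lowered = sentence.lower()
    return sum(1 for phrase in keyphrases if phrase.lower() in lowered)


def summarise(text, keyphrases, length=2):
    """Return the `length` highest-scoring sentences, in document order."""
    sentences = split_sentences(text)
    scored = [(score_sentence(s, keyphrases), i) for i, s in enumerate(sentences)]
    # Highest score first; ties are broken in favour of the earlier sentence.
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:length]
    keep = sorted(i for _, i in top)
    return [sentences[i] for i in keep]


if __name__ == "__main__":
    document = (
        "Digital libraries hold many documents. "
        "The weather was fine. "
        "Keyphrases help users of digital libraries find documents. "
        "Nothing else matters."
    )
    phrases = ["digital libraries", "keyphrases", "documents"]
    for sentence in summarise(document, phrases, length=2):
        print(sentence)
```

Because `length` is an ordinary parameter, calling `summarise` again with a different value recomputes the summary, which loosely mirrors the user-adjustable summary length the abstract mentions.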