2000 Working Papers

 
  • Hierarchical document clustering using automatically extracted keyphrases

    Jones, Steve; Mahoui, Malika (University of Waikato, Department of Computer Science, 2000-10)
    In this paper we present a technique for automatically generating hierarchical clusters of documents. Our technique exploits document keyphrases as features of the document space to support clustering. In fact, we cluster ...
  • A comparative transaction log analysis of two computing collections

    Mahoui, Malika; Cunningham, Sally Jo (University of Waikato, Department of Computer Science, 2000-07)
    Transaction logs are invaluable sources of fine-grained information about users’ search behavior. This paper compares the searching behavior of users across two WWW-accessible digital libraries: the New Zealand Digital ...
  • µ-Charts and Z: Extending the translation

    Reeve, Greg; Reeves, Steve (University of Waikato, Department of Computer Science, 2000-08)
    This paper describes extensions and modifications to the µ-charts as given in earlier papers of Philipps and Scholz. The charts are extended to include a command language, integer-valued signals and local integer variables. ...
  • Benchmarking attribute selection techniques for data mining

    Hall, Mark A.; Holmes, Geoffrey (University of Waikato, Department of Computer Science, 2000-07)
    Data engineering is generally considered to be a central issue in the development of data mining applications. The success of many learning schemes, in their attempts to construct models of data, hinges on the reliable ...
  • A development environment for predictive modelling in foods

    Holmes, Geoffrey; Hall, Mark A. (University of Waikato, Department of Computer Science, 2000-07)
    WEKA (Waikato Environment for Knowledge Analysis) is a comprehensive suite of Java class libraries that implement many state-of-the-art machine learning/data mining algorithms. Non-programmers interact with the software ...
  • Correlation-based feature selection of discrete and numeric class machine learning

    Hall, Mark A. (University of Waikato, Department of Computer Science, 2000-05)
    Algorithms for feature selection fall into two broad categories: wrappers that use the learning algorithm itself to evaluate the usefulness of features and filters that evaluate features according to heuristics based on ...
  • One dimensional non-uniform rational B-splines for animation control

    Mahoui, Abdelaziz (University of Waikato, Department of Computer Science, 2000-03)
    Most 3D animation packages use graphical representations called motion graphs to represent the variation in time of the motion parameters. Many use two-dimensional B-splines as animation curves because of their power to ...
  • µ-Charts and Z: hows, whys and wherefores

    Reeve, Greg; Reeves, Steve (University of Waikato, Department of Computer Science, 2000-03)
    In this paper we show, by a series of examples, how the µ-chart formalism can be translated into Z. We give reasons for why this is an interesting and sensible thing to do and what it might be used for.
  • KEA: Practical automatic keyphrase extraction

    Witten, Ian H.; Paynter, Gordon W.; Frank, Eibe; Gutwin, Carl; Nevill-Manning, Craig G. (University of Waikato, Department of Computer Science, 2000-03)
    Keyphrases provide semantic metadata that summarize and characterize documents. This paper describes Kea, an algorithm for automatically extracting keyphrases from text. Kea identifies candidate keyphrases using lexical ...
  • Interactive machine learning–letting users build classifiers

    Ware, Malcolm; Frank, Eibe; Holmes, Geoffrey; Hall, Mark A.; Witten, Ian H. (University of Waikato, Department of Computer Science, 2000-03)
    According to standard procedure, building a classifier is a fully automated process that follows data preparation by a domain expert. In contrast, interactive machine learning engages users in actually generating the ...
  • Text categorization using compression models

    Frank, Eibe; Chui, Chang; Witten, Ian H. (University of Waikato, Department of Computer Science, 2000-01)
    Text categorization, or the assignment of natural language texts to predefined categories based on their content, is of growing importance as the volume of information available on the internet continues to overwhelm us. ...
  • Using compression to identify acronyms in text

    Yeates, Stuart Andrew; Bainbridge, David; Witten, Ian H. (University of Waikato, Department of Computer Science, 2000-01)
    Text mining is about looking for patterns in natural language text, and may be defined as the process of analyzing text to extract information from it for particular purposes. In previous work, we claimed that compression ...
